FALL 2024 REGISTRATION IS NOW OPEN

Review the Academic Calendar for all important Fall 2024 dates.

Statistical Analysis

This Area of Concentration is optional, yet the successful completion of these requirements will allow this concentration to be noted on your official transcript.

Concentration Courses as Electives

You have the opportunity to heavily customize your MS in Data Analytics and Policy degree because most of the courses listed below can satisfy the Elective Courses requirement. If a course is identified with *NOTE then that course cannot be counted as an elective outside of this concentration without prior academic adviser approval.

Area of Concentration Courses

A minimum of four courses is required to earn this Area of Concentration within the MS in Data Analytics and Policy degree.

This course covers the ways in which analytics are being used in the healthcare industry. Topics include data collection opportunities created by the ACA and other laws, the use of analytics to prevent fraud, the use of predictive modeling based on medical records, the insurance industry's increasing use of data and the ethical issues raised by these practices. Prerequisites: none required (470.681 Probability and Statistics recommended)

In this course, students will develop expertise in using the tools necessary to collect, analyze, and visualize large amounts of text. The course begins with a hands-on introduction to the programming concepts necessary to collect and process textual data. The course then proceeds to cover key concepts in machine learning and statistics that are used to analyze text as data. Throughout the course, students will develop a research project that culminates in the display of results from a large-scale textual analysis. Prerequisite: 470.681 Probability and Statistics.

Machine learning (ML) and, more broadly, artificial intelligence can now be used to perform complex tasks in data science and social science. This course introduces students to a variety of these machine learning techniques. Students will learn the fundamentals of statistical software used for ML, develop an understanding of how ML works, and will then implement these techniques. Further, students will learn how to select an appropriate ML tool depending on the dataset they have and the question to be answered. Prerequisite (one of the following): 470.681 Probability and Statistics, or 470.854 Fundamentals of Quantitative Methods, or experience with R or Python statistical programming and instructor permission.

This course reviews the mathematical principles that are fundamental to quantitative analysis. The course covers functions, probability theory, integral and derivative calculus and matrix algebra.

This course introduces students to big data management systems such as the Hadoop system, MongoDB, Amazon AWS, and Microsoft Azure. The course covers the basics of the Apache Hadoop platform and Hadoop ecosystem; the Hadoop distributed file system (HDFS); MapReduce; common big data tools such as Pig (a procedural data processing language for Hadoop parallel computation), Hive (a declarative SQL-like language to handle Hadoop jobs), HBase (the most popular NoSQL database), and YARN. MongoDB is a popular NoSQL database that handles documents in a free schema design, which gives the developer great flexibility to store and use data. We cover aspects of the cloud computing model with respect to virtualization, multitenancy, privacy, security, and cloud data management.

Prerequisite: 470.763 Database Management Systems

Technology Requirements: a 64-bit computer with a chip that supports virtualization (set via BIOS); Windows Operating System 7, 8, or 10; at least 8 GB of physical RAM; Oracle VirtualBox version 4.2 (free). Please be in touch with the instructor with questions about the technology requirements.

Data are everywhere, and many elected officials and government managers understand they need it. But how can they use it to solve problems and shape policy? What is the best way to make decisions based on a data analysis? How can they communicate those decisions, and the rationale behind them, to employees, citizens, and stakeholders? This course will provide students with an experiential learning opportunity based on real-world scenarios. Students will each take on a role (mayor, police commissioner, human capital director, budget director, public works director, public health director) and participate in a simulated public policy scenario. Working in small groups, students will apply a practical performance analytics process to develop solutions to address governmental challenges. Students will begin by studying foundational concepts and techniques of data collection, analytics, and decision support. They will also learn how to navigate multiple interests, asymmetrical information, and competing political agendas as they make difficult decisions about resource allocation and public policy. Along the way, they will learn how to turn insights into action by effectively communicating the results of analysis to busy executives and decision makers at all levels of the organization. Prerequisites: none required (470.681 Probability and Statistics recommended)

This class applies data analytic skills to the urban context, analyzing urban problems and datasets. Students will develop the statistical skills to complete data-driven analytical projects using data from city agencies, federal census data, and other sources, including NGOs that work with cities. We will examine a variety of data sets and research projects both historical and contemporary that examine urban problems from a quantitative perspective. Over the course of the term, each student will work on a real-world urban data problem, developing the project from start to finish, including identifying the issue, developing the research project, gathering data, analyzing the data, and producing a finished research paper. Prerequisite: 470.681 Probability and Statistics

Learning the basics of Python empowers analysts to retrieve and leverage data in new ways. After covering the fundamentals of syntax, students learn how to read, create and edit data files using Python. Building on that knowledge, students interact with online resources through bulk data APIs and web scraping. Finally, students will use the data they collect to develop an original analysis. Prerequisite: 470.681 Probability and Statistics

This course explores technological and data-driven solutions for policy challenges. This includes developments within government, such as the new types of leadership provided by Chief Innovation or Chief Data Officers, the trend toward digitalization of services, and the movement toward open data. It also covers innovation by citizens through the civic technology movement. Civic tech initiatives have been used to extend and improve services, increase efficiency, design applications for citizen engagement, and improve communication across a variety of policy domains. The course also covers the concept of smart cities and how it can be understood as both new applications of technology (such as sensors and smart infrastructure) and the strategic use of data. For the course project, students will evaluate a policy initiative using city open data, policy research, an analysis of the political culture within which the initiative would be implemented, and the technology that could be used for the initiative. Some familiarity with the R programming language and the RStudio environment is helpful. Prerequisite (one of the following): 470.768 Programming and Data Management; or experience with statistical programming and instructor permission.

Many government agencies engage in data mining to detect unforeseen patterns and advanced analytics (such as classification techniques) to predict future outcomes. In this course, students will utilize IBM SPSS Modeler to investigate patterns and derive predictions in policy areas such as fraud, healthcare, fundraising, human resources, and others. In addition, students will build segmentation models using clustering techniques in an applied manner. Integration with other statistical tools and visualization options will also be discussed. Prerequisite: 470.681 Probability and Statistics; Recommended: 470.709 Quantitative Methods

Analytics inform the decision-making process, strategizing, and forecasting of modern American campaigns. This course focuses on the role that analytics play in campaigns and elections in America. Campaign strategists, policy analysts, and social scientists leverage data from voter rolls, consumption and public opinion polls to make better choices. This course surveys the theoretical and empirical literature in American electoral politics to examine how campaigns and political organizations are using field experiments, microtargeting, and public opinion polling to tackle the challenges of getting out the vote and increasing registration and voting rates. Other topics covered include voting behavior, public opinion, partisanship, and campaign finance. Students will gain a rich understanding of how analytics has become a key component of the electoral process. Students will also gain experience analyzing data through simulations and data analysis exercises. Prerequisites: none required (470.681 Probability and Statistics recommended)

This course provides students with a strong foundation in database architecture and database management systems. Students will evaluate the principles and methodologies of database design and techniques for database application development. Students will also examine the current trends in modern database technologies such as Relational Database Management Systems (RDBMS), NoSQL Databases, Cloud Databases, and Graph Databases. Prerequisite: none

This course is a comprehensive examination of all aspects of designing questionnaires, conducting survey research, and analyzing survey data. The class will cover question construction, measurement, sampling, weighting, response quality, scale and index construction, IRBs, ethics, integrity and quality control, modes of data collection (including telephone, mail, face to face and focus groups), post collection processing and quantitative analysis of data (including chi-square and ANOVA), as well as report writing fundamentals. The class culminates by fielding a survey of student created questions and writing an executive summary of the survey with a paper discussing the research findings. Prerequisite: 470.681 Probability and Statistics

Data science is a methodology for extracting insights from data. This course is an introduction to the concepts and tools that are used in data science with an emphasis on their application to public policy questions. The course covers some advanced data mining and machine learning processes including classification and decision trees, random forests, cluster analysis, and outlier detection, while also providing you with training in the basics of data management and data exploration. All of the work in the course will be conducted to prepare you to proficiently conduct predictive analytics in a real-world setting. Some familiarity with R programming language and the RStudio environment is necessary. Prerequisite: 470.681 Probability and Statistics

This course will introduce computational modeling and demonstrate how it is used in the policy and national security realms. Specifically, the course will focus on agent-based modeling, which is a commonly used approach to building computer models to better understand proposed policies and political behavior. Agent-based models consist of a number of diverse "agents," which can be individuals, groups, firms, states, etc. These agents behave according to behavioral rules determined by the researcher. Their interactions with each other and their environment at the micro-level can produce emergent patterns at the macro-level. These models have been used to understand a diverse range of policy issues including voting behavior, international conflict, segregation, health policy, economic markets, ethnic conflict, and a variety of other policy issues. The course will consist of two parts: First, we will examine the theoretical perspective of computational modeling. Second, you will be introduced to a software platform that is commonly used to develop computational models, agent-based models in particular. Prerequisite: none

Spatial Statistics is a rapidly developing tool in the discipline of ecology that analyzes both 2-D and 3-D data that contain a spatial component. Many ecologists use continuous data (e.g., vegetation density and height, net aboveground primary production, percent of biomass killed by disturbance, etc.) that violate the assumption of spatial independence, which makes it necessary to analyze the data using spatial statistics. Spatial statistics provides concepts, tools, and approaches that will enhance the analyses of population data, sample data, partitioning of regions (patch and boundary), spatial interpolation, and data that are spatially autocorrelated. The goal of this course is to give students a firm grasp of the concepts of spatial statistics in ecology and of how they can be applied to analyze continuous data for environmental policy, management, and assessment. Case studies, data analysis in the R spatial statistics package, and discussions help students examine and apply the concepts.

STATE-SPECIFIC INFORMATION FOR ONLINE PROGRAMS

Students should be aware of state-specific information for online programs. For more information, please contact an admissions representative.

Area of Concentration Courses

Healthcare Analytics and Policy - 470.624

Text as Data - 470.643

Machine Learning and Neural Networks - 470.667

Math for Data Scientists - 470.669

Big Data Management Systems - 470.694

Applied Performance Analytics - 470.699

Urban Data Analytics - 470.703

Unleashing Open Data with Python - 470.708

Policy, Technology and Innovation - 470.738

Data Mining and Predictive Analytics - 470.743

Data-Driven Campaigns and Elections - 470.758

Database Management Systems - 470.763

Survey Methodology - 470.764

Data Science for Public Policy - 470.769

Computational Modeling for Policy and Security Analysis - 470.779

Spatial Statistics - 420.677 *NOTE
