This Area of Concentration is optional, but successfully completing its requirements allows the concentration to be noted on your official transcript.

Concentration Courses as Electives

You can heavily customize your MS in Data Analytics and Policy degree because most of the courses listed below can satisfy the Elective Courses requirement. If a course is marked with *NOTE, it cannot be counted as an elective outside of this concentration without prior approval from an academic adviser.

Area of Concentration Courses

A minimum of four courses are required to earn this Area of Concentration within the MS in Data Analytics and Policy degree.

In this course, students will develop expertise in using the tools necessary to collect, analyze, and visualize large amounts of text. The course begins with a hands-on introduction to the programming concepts necessary to collect and process textual data. It then covers key concepts in machine learning and statistics that are used to analyze text as data. Throughout the course, students will develop a research project that culminates in the display of results from a large-scale textual analysis. Prerequisite: 470.681 Probability and Statistics.

Machine learning (ML) and, more broadly, artificial intelligence can now be used to perform complex tasks in data science and social science. This course introduces students to a variety of these machine learning techniques. Students will learn the fundamentals of statistical software used for ML, develop an understanding of how ML works, and then implement these techniques. Students will also learn how to select an appropriate ML tool for a given dataset and research question. Prerequisite (one of the following): 470.681 Probability and Statistics, or 470.854 Fundamentals of Quantitative Methods, or experience with R or Python statistical programming and instructor permission.

This course reviews the mathematical principles that are fundamental to quantitative analysis. The course covers functions, probability theory, differential and integral calculus, and matrix algebra.

This course introduces students to big data management systems such as Hadoop, MongoDB, Amazon Web Services (AWS), and Microsoft Azure. The course covers the basics of the Apache Hadoop platform and ecosystem; the Hadoop Distributed File System (HDFS); MapReduce; and common big data tools such as Pig (a procedural data processing language for Hadoop parallel computation), Hive (a declarative SQL-like language for handling Hadoop jobs), HBase (a widely used NoSQL database), and YARN. MongoDB is a popular NoSQL database that stores documents in a schema-free design, which gives the developer great flexibility to store and use data. The course also covers aspects of the cloud computing model, including virtualization, multitenancy, privacy, security, and cloud data management.

Prerequisite: 470.763 Database Management Systems

Technology Requirements: a 64-bit computer with a chip that supports virtualization (set via BIOS); Windows 7, 8, or 10; at least 8 GB of physical RAM; and Oracle VirtualBox version 4.2 (free). Please contact the instructor with questions about the technology requirements.

This class applies data analytic skills to the urban context, analyzing urban problems and datasets. Students will develop the statistical skills to complete data-driven analytical projects using data from city agencies, federal census data, and other sources, including NGOs that work with cities. We will examine a variety of datasets and research projects, both historical and contemporary, that approach urban problems from a quantitative perspective. Over the course of the term, each student will work on a real-world urban data problem, developing the project from start to finish, including identifying the issue, developing the research project, gathering data, analyzing the data, and producing a finished research paper. Prerequisite: 470.681 Probability and Statistics

Learning the basics of Python empowers analysts to retrieve and leverage data in new ways. After covering the fundamentals of syntax, students learn how to read, create and edit data files using Python. Building on that knowledge, students interact with online resources through bulk data APIs and web scraping. Finally, students will use the data they collect to develop an original analysis. Prerequisite: 470.681 Probability and Statistics

This course explores technological and data-driven solutions for policy challenges. This includes developments within government, such as the new types of leadership provided by Chief Innovation or Chief Data Officers, the trend toward digitalization of services, and the movement toward open data. It also covers innovation by citizens through the civic technology movement. Civic tech initiatives have been used to extend and improve services, increase efficiency, design applications for citizen engagement, and improve communication across a variety of policy domains. The course also covers the concept of smart cities and how it can be understood as both new applications of technology (such as sensors and smart infrastructure) and the strategic use of data. For the course project, students will evaluate a policy initiative using city open data, policy research, an analysis of the political culture within which the initiative would be implemented, and the technology that could be used for the initiative. Some familiarity with the R programming language and the RStudio environment is helpful. Prerequisite (one of the following): 470.768 Programming and Data Management; or experience with statistical programming and instructor permission.

Analytics inform the decision-making, strategizing, and forecasting of modern American campaigns. This course focuses on the role that analytics play in campaigns and elections in America. Campaign strategists, policy analysts, and social scientists leverage data from voter rolls, consumption, and public opinion polls to make better choices. This course surveys the theoretical and empirical literature in American electoral politics to examine how campaigns and political organizations are using field experiments, microtargeting, and public opinion polling to tackle the challenges of getting out the vote and increasing registration and voting rates. Other topics covered include voting behavior, public opinion, partisanship, and campaign finance. Students will gain a rich understanding of how analytics have become a key component of the electoral process. Students will also gain experience analyzing data through simulations and data analysis exercises. Prerequisite: 470.681 Probability and Statistics

This course provides students with a strong foundation in database architecture and database management systems. Students will evaluate the principles and methodologies of database design and techniques for database application development. Students will also examine the current trends in modern database technologies such as Relational Database Management Systems (RDBMS), NoSQL Databases, Cloud Databases, and Graph Databases. Prerequisite: none

This course is a comprehensive examination of all aspects of designing questionnaires, conducting survey research, and analyzing survey data. The class will cover question construction, measurement, sampling, weighting, response quality, scale and index construction, IRBs, ethics, integrity and quality control, modes of data collection (including telephone, mail, face-to-face, and focus groups), post-collection processing and quantitative analysis of data (including chi-square and ANOVA), as well as report writing fundamentals. The class culminates in fielding a survey of student-created questions and writing an executive summary of the survey with a paper discussing the research findings. Prerequisite: 470.681 Probability and Statistics

Data science is a methodology for extracting insights from data. This course is an introduction to the concepts and tools that are used in data science, with an emphasis on their application to public policy questions. The course covers some advanced data mining and machine learning processes, including classification and decision trees, random forests, cluster analysis, and outlier detection, while also providing you with training in the basics of data management and data exploration. All of the work in the course is designed to prepare you to conduct predictive analytics proficiently in a real-world setting. Some familiarity with the R programming language and the RStudio environment is necessary. Prerequisite: 470.681 Probability and Statistics

Spatial statistics is a rapidly developing tool in the discipline of ecology that analyzes both 2-D and 3-D data that contain a spatial component. Many ecologists use continuous data (e.g., vegetation density and height, net aboveground primary production, percent of biomass killed by disturbance) that violate the assumption of spatial independence, so the data must be analyzed using spatial statistics. Spatial statistics provides concepts, tools, and approaches that enhance the analysis of population data, sample data, partitioning of regions (patch and boundary), spatial interpolation, and data that are spatially autocorrelated. The goal of this course is to give students a firm grasp of the concepts of spatial statistics in ecology and of how they can be applied to analyze continuous data for environmental policy, management, and assessment. Case studies, data analysis in the R spatial statistics package, and discussions help students examine and apply the concepts.
