Authored by: Sanjay Goel
The previous post on this issue was about the draft curriculum. This new post gives the details of the program being started from 2015-16 after incorporating the suggestions and comments from many experts.
- BTech (in any discipline) or equivalent
- Masters (in Computer Applications/Computer Science/IT/Maths/Statistics/Operations Research/Physics/Electronics/Instrumentation/Economics/Commerce/Bioinformatics) or equivalent
Data Analytics involves using computing for reporting, analysis, monitoring, and prediction. International Data Corporation (IDC) forecasts that the business analytics market will reach $50.7 billion in 2016. According to Avendus Capital, the US alone is expected to have a shortage of 140,000 to 190,000 analytics professionals by 2018. NASSCOM predicts that the analytics market in India could reach $2.3 billion by the end of 2017-18.
In India, a good number of consulting, KPO and IT companies provide specialised services in the field of data analytics. Some provide general solutions in all segments, whereas others specialise in specific verticals like banking, retail, marketing, financial, CRM, risk analysis, web, clinical, telecom, insurance, etc. These companies include IBM Analytics, Mu-Sigma, LatentView, HCL Technologies, Accenture, Genpact Analytics, Cognizant Analytics, TCS Analytics, Wipro Analytics, McKinsey Analytics Knowledge Centre, Deloitte Analytics, PwC Analytics, AbsolutData, Fractal Analytics, iCreate, Dunhumby, Global Analytics, Manhattan Systems, Capillary Technologies, Nabler, Activecubes, ICRA Technology Services, WNS Analytics, Opera Solutions, Data Monitor, Ipsos, EXL Services, Meritus, Modelytics, Bridge i2i Analytics, Cytel, Neural Techsoft (Financial & Risk Analytics), Vehere Interactive, Aegis Global, Datamatics, Marketelligent, TNS Global, NettPositive Analytics, Affine Analytics, EVALUESERVE, Innovacer, Crisp Analytics.
In addition, many captive companies/groups provide solutions to their parent organisations. These include HSBC Analytics, Citi Bank Analytics, American Express, Fidelity Analytics, GE Capital, RBS Business Services, Barclays Shared Services, Target Analytics, Spencer Analytics, Amazon Analytics, Dell Analytics, HP Analytics, eBay / PayPal, Experian India, Fair Isaac India, Dun & Bradstreet.
Further, a large number of new IT ventures are starting up to meet the global demand and fill the talent gap in this area. The Telangana state government has already announced a knowledge park on Data Analytics.
MTech (Data Analytics) is an interdisciplinary program offered by the department of CSE & IT. It is designed to meet the large manpower shortage in data analytics, which is well recognized as one of the fastest growing areas. Business and government organisations working in commerce, policy, insurance, finance, economics, engineering, infrastructure, energy, health care, education, security, sports, media, culture, etc. are increasingly relying on computational tools and techniques of data analytics to take informed decisions.
This program has been designed to develop the ability to apply and develop computational techniques and systems to draw insights from big data in a variety of application domains. It is expected that most graduates of the program will join the fast growing Data Analytics industry as technical experts, and a few will even launch their own IT start-ups in this field. Some graduates will also choose to go for doctoral studies and/or pursue a career in academia. The available and growing academic opportunities being developed through our ongoing MTech programs in ‘CSE’ and ‘IT and Entrepreneurship’ will also be leveraged to enrich this program with advanced level computer science elective courses in various areas on the one hand, and design and entrepreneurship related elective courses on the other. Jaypee Centre for Entrepreneurship Development (JCED) will also support the students’ entrepreneurial activity that spins off from this program.
The curriculum exposes students to all aspects of data analytics, including research design, data collection, preparation, analysis, integration, visualization, and interpretation. In addition to the CSE & IT department, the department of mathematics as well as the business school/department of HSS will also contribute courses for this program. The core courses include statistical data analysis, financial econometrics, data warehousing and data mining, pattern recognition and machine learning, large scale graph analytics, empirical research and laboratories. Students will also be offered several electives on theoretical, systemic, algorithmic, and applied aspects of data analytics. The curriculum is strongly oriented towards industry and emphasises industry internship, projects, and entrepreneurship.
|1||Bridge Course (one of the following, depending upon the background): Econometrics (not open for MA Economics) or Programming Foundations for Data Sciences (not open for BTech (CS/IT))|
|2||Mathematics for Data Analytics||3||–||–||3||3|
|3||Data Warehousing and Data Mining||3||–||–||3||3|
|5||Elective – I||3||–||–||3||3|
|6||Elective – II||3||–||–||3||3|
|7||Data Analytics Lab||–||–||4||4||2|
|2||Big Data Technologies||3||–||–||3||3|
|3||Large Scale Graph Analytics||2||–||–||2||2|
|4||Elective – III||3||–||–||3||3|
|5||Elective – IV||3||–||–||3||3|
|6||Elective – V||2||–||–||2||2|
|7||Advanced Data Analytics Lab or Industrial Internship-I (part time)||–||–||4||4||2|
|1||Industrial Internship-II (6 months full time, to start in summer break)||–||–||40||40||14|
|1||Dissertation or Industrial project||–||–||50||50||18|
|Total Credits = 70|
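The stated total can be cross-checked against the credit column of the scheme. The sketch below is only a tally, under two assumptions: the last column of each row is the credit value, and the bridge course, Machine Learning, and Empirical Research carry 3 credits each, as given in the syllabus list of this post rather than in the rows above. The dictionary name `credits` is illustrative.

```python
# Credits per course. Values come from the last column of the scheme rows,
# except where marked: those are taken from the "(3 credits)" notes in the
# syllabus list, which is an assumption about how they enter the total.
credits = {
    "Bridge Course": 3,                      # from syllabus list (assumption)
    "Mathematics for Data Analytics": 3,
    "Data Warehousing and Data Mining": 3,
    "Machine Learning": 3,                   # from syllabus list (assumption)
    "Elective I": 3,
    "Elective II": 3,
    "Data Analytics Lab": 2,
    "Empirical Research": 3,                 # from syllabus list (assumption)
    "Big Data Technologies": 3,
    "Large Scale Graph Analytics": 2,
    "Elective III": 3,
    "Elective IV": 3,
    "Elective V": 2,
    "Advanced Data Analytics Lab or Industrial Internship-I": 2,
    "Industrial Internship-II": 14,
    "Dissertation or Industrial project": 18,
}

total = sum(credits.values())
print(total)  # 70
```

Under these assumptions the tally agrees with the stated total of 70 credits.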
1st Sem Electives (Elective I and II):
- Web Algorithms
- Information Retrieval
- Intelligent Systems
- Multimedia Analytics
- E-Commerce and Social-Web
- Algorithmic Graph Theory
- Mobile and Pervasive Computing
- Distributed Systems (A Core course of MTech (CSE))
- High Performance Software Engineering (A Core course of MTech (CSE))
- Activity Centred Design and User Experience (A Core course of MTech (ITE))
- IT Venture Creation (A Core course of MTech (ITE))
- Data Analytics for Bioinformatics (by Biotech)
- Socio-Political Analysis (by HSS)
- Political-Economic Analysis (by HSS)
- Demographic Analysis (by HSS)
2nd Sem Electives (Elective III, IV, and V):
- Advanced Machine Learning
- Interactive Data Analysis
- Ecommerce and Social Media Analytics
- Mobile and IoT Analytics
- Forensic Analytics
- Information Integration and Visualization
- Computer Vision
- Machine Translation
- Predictive Analytics
- Web Services and Cloud Computing
- Cryptography and Computer Security
- Advanced Algorithms (A Core course of MTech (CSE))
- Performance Evaluation of Computing Systems (A Core course of MTech (CSE))
- IT Product Innovation and Creativity (A Core course of MTech (ITE))
- Game Theory and Business Intelligence (by JBS)
- CRM Analytics (by JBS)
- Financial Analysis (by HSS)
- Ethics & Laws in Social Data Analysis (by HSS)
Electives can be added flexibly.
Any other core/elective course offered to MTech (CSE)/MTech (IT&E)/MTech (ACM) can be opened for this program as well.
Syllabus of Bridge and Core courses:
- Econometrics: (3 credits Bridge Course, not open for students with MA Economics): Predicting Economic Outcomes with Simple Regression Model, Testing Economic Theories and Evaluating Policy Effects with Multiple Regression Analysis, Multiple Regression Analysis: Further Issues, Analysis of Qualitative Data using Dummy Variables, Improving the Econometric Model: A case of Heteroscedasticity, Econometric Model Specification and Data Issues, Modelling Jointly Determined Explanatory Variables: Simultaneous-Equation Models and Pooled/Panel Data Models, Computational Tools like R/Python/Eviews.
- Programming Foundations for Data Sciences: (3 credits Bridge Course, not open for students with CS/IT background): HTML5, DHTML, Overview of Python, Operators and Expressions, Flow Control, Sequence Data (list operations, list methods, strings, sets, dictionaries), Functions, Files, Errors and Exception Handling, Modules, Regular Expressions, Object-oriented programming using Python, GUI, Database connectivity: Python Database API to connect to a MySQL database. Python XML processing: XML parser architecture, Parsing XML with SAX & DOM APIs, Introduction to cloud environment.
- Mathematics for Data Analytics: (3 credits): Random Variable, Discrete and Continuous distributions, Mean and Variance of a Random variable, Binomial, Poisson, Exponential and Normal distributions. Sampling theory and Sampling distributions, Correlation, Regression analysis, Simple linear regression models. Random Vectors, Multivariate Normal Density Function and Multivariate Normal Distributions, Chi-square, F- and t-distributions. Hypothesis Testing and Confidence Intervals, Coefficient of Determination and Multiple Regression, Normal Model. Random Processes, Poisson Process, Markov chains and their transition probability matrix (TPM). Linear Programming Problems: LPP, Simplex Method, Dual Simplex Method, Sensitivity Analysis, Integer Programming, Game Theory, Network Scheduling by PERT and CPM. Random numbers. Monte-Carlo methods. Computational Tools in MATLAB. User defined functions. Programming for basic computational methods such as eigenvalues and eigenvectors of matrices, Sparse matrices, QR and SVD. Interpolation by divided differences. Gauss quadrature.
- Data Warehousing and Data Mining: (3 credits): Schema integration, Data Mart/Data Warehouse, Metadata, OLTP vs OLAP, Data Summarization, Data cleaning, de-duplication, Dimensional Modelling, Fact Tables, Star schema, Snowflake schema, OLAP, ROLAP, MOLAP, online aggregation, From Data Warehousing to Data Mining, Data Cube Computation and Data Generalization, MDX, Classification, Clustering, Computational Tools for Integration Management (ETL, EAI, etc.), Information Management (DW/Mart/ODS, OLAP Servers, etc.), Information Delivery (Portal, Dashboard, Analytics/OLAP Client, etc.), and data mining (e.g., RapidMiner, Weka, etc.).
- Machine Learning: (3 credits): Bayesian reasoning, Parametric methods, Supervised learning, Decision trees, Dimensionality reduction, Mixture densities and expectation maximisation, Unsupervised learning, Optimisation and search, ANN: multilayer perceptron and back propagation network, Kernel Machines, Evolutionary learning, Reinforcement learning, Deep learning, Computational Tools, e.g., Octave.
- Empirical Research: (3 credits): Philosophy of Knowledge and Science, Argument and fallacies, Frameworks of Critical Thinking, Research thinking, Formulating a research problem, IT assisted Informative vs Normative research, Case Study, Experiment design, Controlled experiment, Quasi experiment, IT assisted sampling, Confounding variable, Qualitative and quantitative data, IT assisted data collection (direct, indirect, independent), IT assisted data analysis, IT assisted data visualisation, Concept mapping tools, Theory building, Falsifiability, Validation, Validity threats, Web based empirical research, Empirical research in software engineering, information systems, usability, e-commerce, social media, business, policy making, and social sciences, Introduction to some of the following open source empirical research tools: ODK, QDA Miner Lite, CAT, CATMA, Aquad, Compendium, QCAmap, Octave, PSPP, Hackystat, etc.
- Big Data Technologies: (3 credits): Big data, Structured data, Unstructured data, Storage and indexing of massive databases, Algorithmic topics, e.g., sketching, hashing, random projections, etc., Massive spatial, multimedia, and real time databases, SQL extensions, NoSQL databases, Massive parallel processing, Big data programming: Hadoop/HDFS, MapReduce, Apache Spark, event processing, Big Data Visualisation, Big Data Analytics, Applications of Machine learning techniques.
- Large Scale Graph Analytics: (2 credits): Large scale graph representation and storage, Social networks, Indexing techniques for large graphs, graph compression, query processing, link analysis, evolving graphs, heterogeneous graphs, integrating graph with non-graph data, distributed graph management, large scale graph visualisation, Applications of Machine learning techniques, Computational Tools, e.g., Gephi, HyperGraphDB, Neo4j, etc.