CS 612 Data Exploration and Visualization

Data is the foundation of data science: it must be acquired, integrated, pre-processed, analyzed, and visualized. Data acquisition is a crucial step because it determines both the quantity and the quality of the data, and therefore the effectiveness of every later processing step. A data scientist should also be aware of the range of available options and be able to apply the appropriate analyses. This requires understanding the concepts and approaches of data acquisition, including data shaping, information extraction, information integration, data reduction and compression, data transformation, and data cleaning. Visualization, through graphs and other diagrams, provides readily understood summaries and can also help guide activities such as clustering and classification.

Credits

3