Why Has Data Science Captured the World?
Updated: Dec 23, 2022
Time is the most important factor in my life. With a work week of more than 40 hours, it is hard enough to maintain a work-life balance, and adding the aspiration to become a data scientist takes the challenge to a whole new level. But I persist in working toward that aspiration.
Data science has become so vast and varied that maintaining mastery of a specific domain is a challenge. I prepare a timetable to study it in an inexpensive way, as time is precious. Having been in corporate life for some time now, I have realized that goals and objectives in this arena are far different from those in academia. Research can go wrong in industry, but deadlines still need to be met. The aim is to strive for perfection and avoid mistakes. It is exhausting, since in academia I never felt the need for perfection, only a real joy in whatever I was learning. But then, that joy didn't pay my bills in the end. So, in a sense, it is always a give-and-take.
Where does my intense desire to play with numbers and graphs come into the picture, then? The answer lies in what I have been doing in the industry. There is so much data waiting to play its role on the battleground, and it needs soldiers like us to triumph. Data in itself is ordinary, but when combined with visuals, statistical inferences, and models, it can create a lasting impact (not to mention that it can also generate wealth).
Many people follow old and obsolete ways to represent data. A simple line curve drawn from a table of data does not go beyond presenting it in a 'nicer' way. One needs to understand why different graphs were created and what purpose they serve. A 100% stacked graph, for example, is best for representing composition when the sum totals do not vary much. Using it when the sums vary a lot still shows the composition, but it hides the change in the totals, so it can mislead. On the other hand, a normal stacked graph is best for representing a trend in the composition as well as in the sum.
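To make the difference between the two stacked graphs concrete, here is a minimal sketch. The product names and sales figures are made up for illustration, and `to_percent_shares` and `plot_stacked` are hypothetical helper names, not part of any library. The plotting helper assumes matplotlib is installed.

```python
def to_percent_shares(series):
    """Turn absolute values into percentage shares per period.

    `series` maps a category name to a list of values, one per period.
    Returns (shares, totals), where totals holds the sum for each period.
    """
    n_periods = len(next(iter(series.values())))
    totals = [sum(values[i] for values in series.values())
              for i in range(n_periods)]
    shares = {
        name: [100.0 * v / t if t else 0.0 for v, t in zip(values, totals)]
        for name, values in series.items()
    }
    return shares, totals


def plot_stacked(series, labels, percent=False):
    """Draw a normal stacked bar chart, or a 100% stacked one if percent=True."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    data = to_percent_shares(series)[0] if percent else series
    fig, ax = plt.subplots()
    bottom = [0.0] * len(labels)
    for name, values in data.items():
        # Each category is drawn on top of the running total so far.
        ax.bar(labels, values, bottom=bottom, label=name)
        bottom = [b + v for b, v in zip(bottom, values)]
    ax.set_ylabel("share (%)" if percent else "value")
    ax.legend()
    return fig


# Made-up quarterly sales for two hypothetical products.
sales = {"product_a": [10, 40, 90], "product_b": [10, 40, 30]}
shares, totals = to_percent_shares(sales)
print(shares)  # composition per quarter, in percent
print(totals)  # the totals that a 100% stacked chart does not show
```

In this made-up data, the first two quarters have the same 50/50 composition even though the total quadruples, so a 100% stacked chart would draw them identically. Calling `plot_stacked(sales, ["Q1", "Q2", "Q3"])` shows both the totals and the split, while `percent=True` shows only the split.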
Developing this sense requires time and effort. Industry insights from experienced seniors definitely help, but individual effort is a must for a deeper understanding. Many people tend to do the bare minimum, creating simple plots and living life on the edge. These people still meet deadlines, but their work is simply mediocre. I am not saying that we should create complicated plots that take the bejesus out of your boss to interpret. What I am saying is that presenting simple data in an artistic way creates a unique selling point for the work you did, and that representation will fetch more applause.
If a magazine like Reader's Digest were simply text with no visuals, I would hardly find it an interesting read. Similarly, data represented blandly will be boring and make the boss wish that he existed in a parallel universe after the presentation.
For a data enthusiast, uncomplicating data is a motivation. As a real example, while making sense of chromatograms, I found that most of the relevant data could not easily be represented using the system software by someone new like me who didn't know how to use it. But with Python visualization tools, I took the CSV files from the chromatographic runs, tweaked some Python code here and there, and got a plot with multiple parameters represented on a single graph. Now I didn't have to search for an answer when the boss asked, 'Where is this parameter's data? The plot doesn't make sense without it!'
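A rough sketch of that kind of workflow might look like the following. It is not the code I actually used: the column names (`time_min`, `uv_mAU`, `conductivity_mS_cm`), the sample values, and the helper names `load_columns` and `plot_run` are all assumptions for illustration, and a real export from chromatography software will likely need extra cleanup. The plotting helper assumes matplotlib is installed.

```python
import csv
import io


def load_columns(handle):
    """Read a CSV with a header row into {column name: list of floats}."""
    reader = csv.DictReader(handle)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            columns[name].append(float(value))
    return columns


def plot_run(columns, x_name, left_name, right_name):
    """Plot two parameters against a shared x-axis, each on its own y-axis."""
    import matplotlib.pyplot as plt  # lazy import: only needed for plotting

    fig, ax_left = plt.subplots()
    ax_left.plot(columns[x_name], columns[left_name], color="tab:blue")
    ax_left.set_xlabel(x_name)
    ax_left.set_ylabel(left_name, color="tab:blue")

    # A second y-axis lets parameters with different units share one graph.
    ax_right = ax_left.twinx()
    ax_right.plot(columns[x_name], columns[right_name], color="tab:red")
    ax_right.set_ylabel(right_name, color="tab:red")
    return fig


# Made-up sample standing in for an exported run file (hypothetical columns).
sample = io.StringIO(
    "time_min,uv_mAU,conductivity_mS_cm\n"
    "0.0,1.5,10.0\n"
    "0.5,20.0,12.5\n"
    "1.0,3.0,15.0\n"
)
run = load_columns(sample)
```

With a real file, `load_columns(open("run.csv", newline=""))` would replace the in-memory sample (the file name is a placeholder), and `plot_run(run, "time_min", "uv_mAU", "conductivity_mS_cm")` would put both parameters on a single graph.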
For any employee, it is important to match their data representation to the audience. For an audience such as senior technical personnel, a lot of fact-based information is good, but for someone like the CEO, a unique visualization that proves the point is both time-saving and effective (imagine the time the CEO would have to spend scanning thousands of data tables and plots sent by hundreds of employees). A visualization should be capable of telling a story by itself while instilling curiosity in the observer. A visual hook in a presentation stays in the memory far longer than a bunch of numbers, line graphs, and facts stated without insight.