BECOMING A DATA ANALYST

Khan Muhammad Saqiful Alam is a BBA graduate from IBA, DU who went on to pursue a Master’s degree in Operations, Project Management, and Supply Chain from the University of Manchester in 2011. After a slight prompt from one of his professors, his interest shifted towards the field of analytics, specifically simulation and credit risk modelling. He returned to Bangladesh and joined North South University as a lecturer, later becoming a senior lecturer, teaching operations management, applied statistics, and financial risk management.
Applied statistics introduced Mr. Saqiful Alam to the R programming language. In 2016, he went to Purdue University as a research scholar and started working on a project with MIT. This project, ‘The Billion Prices Project’, was where he was first exposed to the concept of Big Data and its different dimensions, such as machine learning and data science.
After a year, he came back to Bangladesh and tried to introduce the concept of decision making using data analytics. He revamped the Decision Support System course at NSU and conducted training sessions for LightCastle, Upskill, and IBA certification programs such as ACMP, covering the analytics module in those and quite a few other courses. In 2018, he joined Intelligent Machines Limited, which works with AI and data-driven solutions, as an analytics advisor. That is how he has been trying to help the industry become more data-driven. He is currently in Singapore, where he is doing research on machine learning applications for strategy while working as a data strategist and Trust and Safety program manager at TikTok.

MBR: How do you see the prospects for data analysts, both in Bangladesh and globally?

K. M. Saqiful Alam: Globally, data analytics has been termed a ‘hot topic’. The whole world has understood the value of data, and organizations are slowly transforming themselves to be data-driven: they have developed collection processes, data pipelines, data dashboards, and so on. In a nutshell, organizations are pushing their management and strategy to be more evidence- and data-driven. They want their management to support experience and intuition with data-supported insights. This is happening mostly in developed countries, but developing countries such as Bangladesh are also leveling up, since there are massive prospects in this arena. Some companies, if not all, have definitely recognized the importance of data analytics. For example, companies like Pathao and bKash are already data-based. Companies such as BRAC, GP, and Robi have collected the data they need and are building their capacity to utilize it with advanced methods. Then there are companies that have identified the prospects but are yet to reap the benefits. Overall, there is a huge prospect for Bangladesh in the field of data analytics, as these processes make decision making more effective. The field can be expected to reach its peak in roughly 4-5 years.

MBR: We often mix up Data Scientist with Data Analyst. Can you please specify the difference for us?

K. M. Saqiful Alam: Since it’s a very new field, the roles don’t exactly have formal definitions. Over time, however, patterns have emerged that help differentiate them. The University of California, Berkeley summarizes the role of a data scientist as having five parts: collection, storage, processing and preprocessing, analysis, and communication.

Data Collection: Collecting data essentially means pipelining it, through any method of data collection, into the system’s database. The focus in this step is to make sure that the right data is collected and that the limitations of the collection are properly communicated to the users of this data.

Storage: Next, the data is stored in a data lake, a data cloud, a big data platform, or similar. This step also deals with how access to the data will be designed, how the data will eventually be delivered to the analyst, and other related work. Here the focus is on efficiency and cost minimization, as each query on a cloud server costs the company money.

Processing and preprocessing: Processing deals with cleaning up data, improving the data quality, and reporting the data properly.

Analysis: Analyzing the data involves extracting the data from the dashboards and pipelines, running analysis on them, using statistical and machine learning tools to generate insights and to aid in decision making.

Communication: Finally, communicating the data effectively across the team, and also to external and internal stakeholders, is an important step which is often overlooked.

Now, a data scientist is someone who has a deeper understanding of the five steps and is more focused on answering questions and running scientific research with the data provided to them. They end up designing deeper, long-term strategies. For instance, a data scientist at Pathao was assigned not to get busy with day-to-day problems, but rather to use data to optimize route suggestions and pricing strategies. This enables a data scientist to go deeper into the aspects of data collection and analysis, and to be less distracted by day-to-day firefighting.

On the other hand, a data analyst is someone who works more closely alongside management. This person has to support day-to-day or week-to-week management strategies, which are essentially mid-level strategies, although they also act as advisors to higher-level strategists. A data analyst needs to get data from a data engineer or from cloud storage in order to carry out the analysis. Simply put, the task of a data analyst involves looking at data dashboards and making day-to-day or mid-level decisions. Ad hoc and weekly problems are handled by them. At times they run machine learning models, but the focus is on supporting an immediate issue, whereas a data scientist has to go much deeper in order to build a tool, an overall system, or a high-level strategy. For example, Netflix has both data scientists and data analysts. A data scientist at Netflix designs efficient algorithms to recommend to users what to watch, while a data analyst looks at the dashboards to see which shows are being rated highly or watched more, and analyzes which genres of shows are more popular.
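
The kind of dashboard question described above, which genres are rated most highly, comes down to a simple aggregation. The Python sketch below uses invented viewing records and hypothetical field names; it is not based on any real Netflix data or API.

```python
from collections import defaultdict

# Invented records for illustration: (show title, genre, user rating out of 5)
ratings = [
    ("Show A", "drama", 4.5),
    ("Show B", "comedy", 3.8),
    ("Show C", "drama", 4.1),
    ("Show D", "documentary", 4.7),
    ("Show E", "comedy", 4.0),
]

def average_rating_by_genre(records):
    """Group ratings by genre and return {genre: average rating}."""
    totals = defaultdict(lambda: [0.0, 0])  # genre -> [sum of ratings, count]
    for _title, genre, rating in records:
        totals[genre][0] += rating
        totals[genre][1] += 1
    return {genre: total / count for genre, (total, count) in totals.items()}

averages = average_rating_by_genre(ratings)
# List genres from highest to lowest average rating
for genre, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{genre}: {avg:.2f}")
```

An analyst would typically get the same result from a dashboard tool or a query; the point here is only that the underlying operation is a group-and-average.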

MBR: Will you please discuss the steps of becoming a Data Analyst with us?

K. M. Saqiful Alam: In order to be a data analyst, one needs to have expertise in three areas.

First, it has to start with a basic understanding of statistics. Beyond dashboards and visualizations, statistical analysis is the most necessary skill. Basic concepts of statistics such as probability, margin of error, and confidence intervals are some of the things one must have prior knowledge of. Statistical knowledge is necessary to interpret the output of the data analysis. Some sources for building this knowledge are Khan Academy, the YouTube channel Data Quest, and courses available on Coursera such as ‘Statistics using R’, a course by Duke University.
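
As a rough illustration of the margin of error and confidence interval mentioned above, the following Python sketch computes both for a sample mean. The sample values are made up for illustration, and the normal approximation (rather than a t-distribution) is an assumption chosen to keep the example short.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Return (mean, margin_of_error, lower, upper) using a normal approximation."""
    n = len(sample)
    m = mean(sample)
    standard_error = stdev(sample) / sqrt(n)        # sample std dev / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * standard_error
    return m, margin, m - margin, m + margin

# Hypothetical daily order counts, invented for this example
orders = [120, 135, 128, 142, 117, 131, 125, 138, 129, 133]
m, margin, lower, upper = mean_confidence_interval(orders)
print(f"mean={m:.1f}, margin of error={margin:.1f}, 95% CI=({lower:.1f}, {upper:.1f})")
```

With a sample this small, a t-distribution would give a somewhat wider interval; the structure of the calculation stays the same.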

The second step involves getting familiar with analytics tools, such as dashboard tools like Tableau (paid access), Power BI (free access), and Google Data Studio (free access). On the software and language side, one needs exposure to a programming language such as R or Python.

Finally, the person will need domain expertise if they have a predetermined area they prefer to work in. An important quality of an analyst is the ability to truly understand and interpret the data, and that requires domain expertise.

Now, how to get domain expertise?

A person can attain domain expertise by going through data sets available on Kaggle, Google Cloud, the UCI Machine Learning Repository, and similar sources. Kaggle has discussion sections where learners can see how people are analyzing the data and finding insights. The core idea is to read and work on projects in a specific area to get exposure and build domain knowledge.

MBR: What roadblocks does an aspiring data analyst generally face, and how can they be mitigated?

K. M. Saqiful Alam: Roadblocks and mitigation methods would be:

i) Overcoming the fear that surrounds the thought of becoming a data analyst – People fear that their educational backgrounds and concentrations will act as constraints when pursuing a career in data analytics. In practice, these are rarely major roadblocks.

ii) Time and commitment – There is no easy way to become an expert in this field. Expertise comes from investing time and commitment. An enthusiast can easily self-learn by enrolling in the analytics courses available on online learning platforms such as Coursera or Khan Academy, or by joining training sessions arranged by organizations such as Upskill, LightCastle, and many others.

MBR: Data analysis is gaining popularity in the country over time. What is your opinion regarding this?

K. M. Saqiful Alam: From around 1990 to almost 2010, organizations were in the data collection stage. In the process, many companies set up their ERP and MRP systems and collected data for operational needs. As processes such as payments became digitized, more and more information got stored. For example, Daraz, by nature of being an e-commerce website, ends up collecting a lot of data on website visits and customer behavior on the site. Eventually, people started to understand that they could use this vast amount of data to make business decisions: to design better campaigns, understand customers better, predict customer behavior, and dig further into their own capabilities.

MBR: How can data science impact core banking activities such as lending, client service, and credit rating?

K. M. Saqiful Alam: Speaking from experience, I worked with a bank in Indonesia in its credit risk analysis department, where I looked into the purchasing patterns of its credit card clients. This gave me access to a wide variety of customer-specific information, such as the customer’s monthly tax payments, their loans, their failures to pay, their company information (from which we also got an idea of the size of the company), their years of experience, their home address, and a lot more. Coupled with the bank’s existing credit risk calculations, this data is used to produce credit predictions that estimate the probability of a given person defaulting. Another example is the offers attached to cards and accounts. For instance, a bank in Singapore is planning to launch a card called “Your Card”, which can be described as the ultimate personalized card. This card will suggest different offers to cardholders based on their transaction patterns. In customer service, customer-specific suggestions and customer charts can be analyzed to predict how many customers might leave or return, and why they are leaving or staying. The lifetime value of customers can also be calculated. Previously, customers had to be clustered, but now, because of data science, we can have one specific, machine-designed campaign for one specific customer.
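
To make the idea of a default probability concrete, here is a minimal Python sketch of a logistic scoring function. The feature names and weights are entirely hypothetical, chosen for illustration; a real credit risk model would estimate them from historical data and would not necessarily be logistic.

```python
from math import exp

# Hypothetical weights, invented for illustration (not from any real bank model)
WEIGHTS = {
    "missed_payments": 0.9,   # more missed payments -> higher risk
    "debt_to_income": 2.0,    # higher debt burden -> higher risk
    "years_employed": -0.15,  # longer employment -> lower risk
}
INTERCEPT = -3.0

def default_probability(customer):
    """Map customer features to a probability between 0 and 1 with a logistic function."""
    score = INTERCEPT + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + exp(-score))

# Two made-up customer profiles
low_risk = {"missed_payments": 0, "debt_to_income": 0.2, "years_employed": 10}
high_risk = {"missed_payments": 3, "debt_to_income": 0.8, "years_employed": 1}

print(f"low-risk customer:  {default_probability(low_risk):.3f}")
print(f"high-risk customer: {default_probability(high_risk):.3f}")
```

The same pattern, a score per customer rather than per cluster, is what makes individually targeted offers and campaigns possible.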