Unit 8: Programming Project


Importance of an MSc in Data Science



1 - Description:

I worked on a report titled 'Predicting Customer Responses in Telemarketing,' which involved analyzing a dataset from a bank in Peru. This dataset contained the outcomes of thousands of telemarketing campaign calls aimed at encouraging customers to open a long-term deposit account. My primary objective was to identify the customer attributes that influenced the decision to accept the offer and to develop a model in R that predicts a call's outcome from variables such as occupation and call duration. The process involved thorough data exploration, visualization, and building a predictive model, which I then tested and reported on.


2 - Feelings:

At the outset, I felt optimistic because the project let me apply the principles and techniques I had learned in previous modules. Starting early meant I would not have to rush near the deadline, which gave me confidence. However, apprehension set in as I approached the model-building phase. While I was comfortable exploring the dataset and visualizing insights, I was unfamiliar with creating and testing models. My apprehension grew as the deadline neared, especially since I was relatively new to both academic writing and data science: I had been homeschooled and did not hold a formal bachelor's degree. To overcome these feelings, I attended all the seminars, asked questions, and sought guidance from my professor. Although I had to revise my report multiple times, his constructive feedback boosted my confidence and helped me steadily improve the quality of my work.


3 - Evaluation:

The early stages of the report went smoothly, largely because I started early. I had ample time to explore the dataset and learn R. This phase was unhurried, and I spent the first two weeks familiarizing myself with R terminology, which was essential for proper analysis and visualization. However, challenges arose during the second part of the project: building and testing a model. I had difficulty understanding the statistical terminology in the model summary, which required extra research. The model's performance was also poor at first, but I was determined to find a solution. After experimenting with different variables, I diagnosed the issue and adjusted the model's classification threshold, which improved its performance. I was proud of this progress, even though the model wasn't as accurate as I had hoped. Submitting drafts to my professor for feedback and making the necessary revisions proved crucial in refining the report and ensuring it met academic standards.
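The effect of a threshold adjustment of this kind can be illustrated with a minimal sketch. It is written in Python rather than the R used for the report, and the probabilities, labels, and the helper name metrics_at are all made up for illustration; none of them come from the actual dataset or model.

```python
def metrics_at(probs, labels, threshold):
    """Return precision and recall when predicting 1 for probs >= threshold."""
    tp = fp = fn = 0
    for p, y in zip(probs, labels):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}


# Hypothetical predicted probabilities of "customer accepts the offer"
# and the true outcomes (1 = accepted). Illustrative values only.
probs = [0.05, 0.10, 0.12, 0.20, 0.22, 0.30, 0.35, 0.45, 0.55, 0.70]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 0, 1]

for threshold in (0.5, 0.3):
    m = metrics_at(probs, labels, threshold)
    print(f"threshold={threshold}: "
          f"precision={m['precision']:.2f}, recall={m['recall']:.2f}")
# threshold=0.5: precision=0.50, recall=0.25
# threshold=0.3: precision=0.60, recall=0.75
```

In this toy example, lowering the threshold from 0.5 to 0.3 catches more of the customers who accepted. On real data, a lower threshold usually also produces more false positives, so the choice depends on the relative cost of a missed customer and a wasted call.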


4 - Analysis:

Starting the project early gave me the time to explore the dataset thoroughly, which I believe was a key strength. I worked methodically, learning R from scratch as I explored the data. While this initially felt like a disadvantage given my familiarity with Python, it became a learning opportunity and showed that I can adapt and pick up new languages when needed. One of the main issues I faced was getting carried away with surface-level exploration, which left less time to pursue more impactful insights. The model's initial poor performance could likely have been improved with more advanced techniques, such as SMOTE (Synthetic Minority Over-sampling Technique), but time constraints prevented me from trying them. Alongside these setbacks, feedback from my professor highlighted the importance of being specific with my insights and recommending actionable steps. His advice helped me make my report more solution-oriented, particularly in advising the bank on how to use the insights to achieve its goals.
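For context, the core idea behind SMOTE is to create synthetic minority-class examples by interpolating between an existing minority point and one of its nearest minority neighbours. The following is a simplified sketch of that idea in plain Python, not the method used in the report and not a substitute for a library implementation. The function name smote_like, the parameters, and the sample points are hypothetical, and the sketch assumes purely numeric features.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a random
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # All other minority points, sorted by squared distance to base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class examples with two numeric features.
minority_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_like(minority_points, n_new=6)
print(len(new_points), "synthetic minority examples generated")
```

Because interpolation only makes sense for numeric values, categorical variables such as occupation would need encoding or a SMOTE variant designed for mixed data. Synthetic examples should also be added to the training split only, so that the test set still reflects the real class balance.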


5 - Conclusion:

In retrospect, I could have managed my time better by balancing data exploration with deeper analysis. Spending too much time on initial exploration meant that I rushed parts of the report toward the end, especially the modeling. Had I started working on the model earlier, I might have been able to explore more advanced techniques, which could have yielded better results. However, the early feedback from my professor proved invaluable in shaping the final report, and I'm pleased that I took the initiative to submit drafts for review. This process taught me that feedback and revision are key to producing a polished, well-structured report. I now understand that simply presenting insights isn't enough; it's equally important to propose specific actions based on them.


6 - Action Plan:

For future projects, I will balance broad data exploration with earlier, deeper investigation of promising insights. I plan to allocate more time to advanced techniques for addressing model performance issues, particularly class imbalance, for example by trying SMOTE early on. I will also continue to seek feedback throughout the development of my reports, as this helped me refine my approach and stay aligned with the assignment's objectives. In terms of learning new skills, I've gained confidence in my ability to adapt quickly to new programming languages, and I'll carry this mindset forward when tackling similar tasks. By following a more structured timeline and ensuring that each stage of the project receives adequate focus, I believe I can achieve better outcomes in future assignments.