Dataset Search by Google

As a grad student on a never-ending quest for the best datasets for research projects and analyses, I’ve often found myself yearning for a more streamlined approach to looking for data, which is why I was super excited to read about Google’s Dataset Search service. Check it out: https://toolbox.google.com/datasetsearch, or read more about it on The Verge.

This is great news for research and academic communities, and I’m happy it’s come just in time for me to write my thesis 😉

Why I chose to pursue the QMSS Program at Columbia For Grad School

If the 2016 presidential election taught me anything about data analysis and the social sciences, it is that the intersection of these two fields is both powerful and unpredictable. In fact, many institutions, from governments and businesses to think tanks and universities, increasingly rely on the marriage of these two disciplines to understand how people think, feel, and behave, and subsequently leverage this information to strategize and innovate. The aftermath of the election is particularly compelling because it caused an entire nation to question not only the information and perceptions they had, but also the predictive tools, models, and methods of analysis available to them throughout the election process. Additionally, such an unexpected outcome continues to have profound implications for the social, political, and economic landscape of the United States. This type of excitement and intrigue is not limited to politics, however, and it is exactly why I chose to study in Columbia University’s Quantitative Methods program with a focus in Data Science. The field is not only fascinating, but extremely important.

Prior to grad school, I delved into various fields of research, including public health policy and outcome measure development. My experiences thus far have taught me how crucial and complex data and data analysis can be in many different disciplines and, more specifically, how important they are in the areas of healthcare and medicine. For example, from September 2014 to July 2017, I worked at the Yale Center for Outcomes Research and Evaluation, where I had the opportunity to work alongside many physicians, scientists, and statisticians to develop measure methodologies that help evaluate the performance and quality of hospitals across the United States. A major part of our work involved understanding and applying statistical methods and science to patient data and health outcomes with the goal of measuring, evaluating, and improving healthcare quality. Through my work, I learned about various types of analytic models and approaches, from hierarchical generalized linear models (HGLM) to Markov chain Monte Carlo (MCMC), and the merits and obstacles of applying various tools and techniques to the social and medical characteristics of our patient cohorts. My work also helped me understand how policies that affect not only the bottom lines of hospitals, but the lives of millions of patients every day, are developed and implemented.

One of the teams that I worked with was trying to develop and harmonize a complex measure model that was both statistically sound and clinically accurate. It was exciting and challenging to discover that even though our methods had been vetted and tested, our models could still face issues with convergence, or scrutiny from experts in various fields questioning our approach, validity, or reliability. This demonstrated that when you are dealing with people and data, you can always expect some level of uncertainty. Another team I worked with used statistical analyses to create an interactive data visualization website that helps educate policy makers, researchers, journalists, and stakeholders about healthcare outcomes and the factors that affect a hospital’s performance. While working on this team, I learned how to take complex data and concepts and turn them into something dynamic and user-friendly.

Continuing my educational pursuits at Columbia has also been personally meaningful to me. During my undergraduate experience, I took classes with professors who became valued mentors, and gained exposure to many interdisciplinary fields of study. Although my educational experiences were at times very challenging, I truly value the emphasis that Columbia places on intellectual curiosity and academic excellence. One of my favorite aspects of our curriculum is the way students at Columbia are encouraged to blend the humanities and sciences and understand how different fields affect one another.

My greater goal is to contribute to the field of computational social science, whether it be in health, tech, or both, in meaningful ways in order to understand the world and leave a positive impact on people’s lives and well-being. My desire to do meaningful and impactful work is a value that I was taught through my academic experiences and leadership roles at Columbia.

As science and technology advance, we are gaining access to even more information about each other than we could have previously imagined. With so much information at our fingertips, the need for analyzing, synthesizing, and interpreting data from our dynamic social landscapes or medical repositories has grown exponentially as well. From personalized medicine to the composition of the human genome, opportunities for treating patients and improving people’s health through data make the study of quantitative methods in the social sciences progressive and essential. Therefore, I’m really excited about the skills I’ve been learning and the problems I’ve been attempting to understand and solve during grad school. I’m also really looking forward to applying these skills and critical analyses across even more topics and fields.

Some Reflections

As I begin my third semester of grad school, I’ve decided to write down and share some reflections I’ve had during my program. So far I’ve really loved my experience at Columbia. Grad school here is wayyy better than undergrad and it’s fun to study something you like with a clear set of goals and objectives (so stay in school, kids).

I began my program with little experience in coding, but was able to learn things quite fast: a lot of what we did in my classes was clear, intuitive, and enjoyable. By having a social science approach to the problems we solved, I found the work super interesting and relatable.

Some of my favorite classes included: Data Visualization, Applied Data Mining (or Machine Learning), Social Network Analysis, and Advanced Analytic Techniques. Some languages/technical skills I’ve been able to pick up in these classes include R, Python, SQL, Regression Analysis, and more.

[Image: “In God we trust, all others bring data”]