Applied Data Science Portfolio.
While I studied for my master's in Applied Data Science at the School of Information Studies, I was required to work on a variety of data-related projects ranging from Information Visualization to Text Mining. So, I thought it would be nice to display the best work that showcases my particular skills here.
Keep in mind that not all of the completed projects here are included in my final exit requirement essay and video/presentation.
Make sure to click on “Learn More” to find everything related to that particular project.
Towards the bottom, you will see the exit requirement deliverables that were submitted to a panel of faculty for review in my final graduate semester (Spring 2019).
Data Mining.
Figuring out the Next Type of Crime in Queens, New York
I worked on this project alongside Adam Miller, Max Gerstman, and James Lu. My main role was to make the team's work more efficient.
We were all interested in figuring out the next type of crime to occur, specifically in a bar or nightclub in Queens, New York. Our data set was New York City Crimes (2014-2015). We did both data modeling and descriptive modeling to display our results. One challenge we had to overcome was that the data set we chose had over 1 million crime records.
Information Visualization.
Determining if 'Stars' affects 'Is_Open'
I worked on this project by myself. My role was to make a 24" by 36" poster demonstrating that I am able to analyze and visualize a large data set.
Based on 2017 Yelp Business Data, I tried to determine whether the rating of a place on Yelp affects the rate at which places close. Additionally, I wanted to determine which of the ratings I selected had the highest closing rate.
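For illustration, the closing rate for a rating can be read as the share of businesses with that rating whose is_open flag is 0. The snippet below is a minimal sketch of that calculation, not the code behind the poster. It assumes each record carries the `stars` and `is_open` fields named in the project title, and the sample rows are made up.

```python
from collections import defaultdict

def closing_rate_by_stars(businesses):
    """Return {stars: share of businesses with is_open == 0}."""
    totals = defaultdict(int)
    closed = defaultdict(int)
    for b in businesses:
        totals[b["stars"]] += 1
        if b["is_open"] == 0:
            closed[b["stars"]] += 1
    return {s: closed[s] / totals[s] for s in sorted(totals)}

# Hypothetical rows for illustration only (not real Yelp records).
sample = [
    {"stars": 2.0, "is_open": 0},
    {"stars": 2.0, "is_open": 1},
    {"stars": 4.5, "is_open": 1},
    {"stars": 4.5, "is_open": 1},
    {"stars": 4.5, "is_open": 0},
    {"stars": 4.5, "is_open": 1},
]

rates = closing_rate_by_stars(sample)
# The rating with the highest closing rate in this toy sample.
worst_rating = max(rates, key=rates.get)
```

Comparing `rates` across ratings shows whether lower-rated places close more often, and `worst_rating` picks out the rating with the highest closing rate.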
Data Warehouse.
Developing a Bank Data Warehouse
I worked on this project alongside Fernando Granato, Akshit Salian, and Rohit Anchan. My main role was to apply my Database Management skills and Information Based Organizations knowledge to a fictitious data warehouse.
We were tasked with redesigning a legacy bank platform as a new data warehouse platform. Although no data set was used for this project, we needed to look into different bank data warehouse applications that were built in the real world. We extracted data from the legacy system, transformed the old data to match the new system's needs, and loaded it into our new data warehouse.
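The extract, transform, and load steps can be sketched in a few lines. This is only an illustration under stated assumptions: the legacy rows, the `dim_account` table, and every column name are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical legacy rows: (account id, customer name, balance text, MM/DD/YYYY date).
LEGACY_ROWS = [
    ("A-001", " jane doe ", "1,250.50", "03/15/2011"),
    ("A-002", "JOHN ROE", "310.00", "11/02/2016"),
]

def extract():
    """Extract: a real system would read these rows from the legacy database."""
    return list(LEGACY_ROWS)

def transform(row):
    """Transform: tidy names, parse balances, convert dates to ISO 8601."""
    account_id, name, balance, opened = row
    month, day, year = opened.split("/")
    return (
        account_id,
        name.strip().title(),
        float(balance.replace(",", "")),
        f"{year}-{month}-{day}",
    )

def load(conn, rows):
    """Load: write the cleaned rows into a (hypothetical) warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dim_account ("
        "account_id TEXT PRIMARY KEY, customer_name TEXT, "
        "balance REAL, opened_on TEXT)"
    )
    conn.executemany("INSERT INTO dim_account VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, [transform(r) for r in extract()])
loaded = conn.execute("SELECT * FROM dim_account ORDER BY account_id").fetchall()
```

Keeping the three steps as separate functions makes it easy to test the transformation rules without touching either database.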
Text Mining.
Analyzing Past Syracuse University Commencement Speeches
I worked on this project alongside Ryan French. My main role was to find, transcribe, and draw insights from 16 different Syracuse University Commencement Speeches (2002-2017).
We wanted to determine the similarities and important elements of each of the speeches, since we were going to be Syracuse University alums soon. We used the LDA algorithm for topic modeling and speech similarity.
Data Analysis & Decision Making.
Foreseeing Zomato Restaurant Rating Data
I worked on this project alongside Shanshan Ma, Rohan Mahajan, Pranjali Nag, and Minyang Wang. My main role was to apply my statistical knowledge to learn more about the chosen data set.
We were all focused on predicting the aggregate restaurant rating based on six different variables within the data set. Our data set was Zomato Restaurant Data. We used three different regression models to help us estimate what the aggregate rating was going to be.
Big Data Analytics.
Making Your Next Flight Easier
I worked on this project alongside Adam Miller, Poorvi Varma, and Alex Liu. My main role was to research and develop four different big data models.
We all wanted to calculate the probability that a flight, based on the user's inputted flight information, is going to be delayed, as well as the model's predicted outcome. Our data set was 2015 Flight Delays and Cancellations. We needed to sample the data set because of how big it is and the dedicated resources that were available.
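The write-up does not say how the sample was drawn. As one possible approach, reservoir sampling (Algorithm R) draws a uniform random sample of fixed size in a single pass without loading the whole file into memory. The function name and parameters below are illustrative, not taken from the project.

```python
import random

def reservoir_sample(rows, k, seed=None):
    """Return k rows chosen uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Replace a kept row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample

# Stand-in for streaming over a large file of flight records.
subset = reservoir_sample(range(1_000_000), k=1000, seed=42)
```

Passing a `seed` makes the sample reproducible, which helps when several models need to be compared on the same subset.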
Metadata.
Forming a Data Product Application Profile
I worked on this project by myself. My role was to apply class concepts to craft my own application profile for Data Products.
Some of the project deliverables included creating an encoding schema, a crosswalk of elements, and an example XML file. There is a wide variety of data sets, in both number and category, out on the web. One of the biggest struggles I noticed is the lack of consistency across the board, since there is no common metadata schema out there.
Principles of Management Science.
Routing the Shortest Bar Crawl for Downtown Syracuse, New York
I worked on this project alongside Christopher Smith, Jason Ezzari, Ashley Champagne, and Ling Yang. My main role was to sell the idea of finding the shortest path or, in this case, the shortest bar crawl of Downtown Syracuse, New York.
We all wanted to find the shortest path around 15 different bars, starting from Dinosaur BBQ and ending at The Evergreen. We needed to research what past bar crawls looked like to confirm that our plan was not unreasonable. We took advantage of Google Earth Pro to get our distances, and used Solver in Excel and the integer programming tool within Excel QM to achieve our final results.
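To make the problem concrete, here is a minimal sketch of a shortest route with a fixed start and end, solved by brute force over a toy instance. It is not the Excel model from the project. "Bar A", "Bar B", and "Bar C" are placeholder names, and the distances are made up by placing the stops along a single line.

```python
from itertools import permutations

def shortest_crawl(dist, start, end):
    """Try every order of the middle stops and keep the shortest route."""
    middle = [s for s in dist if s not in (start, end)]
    best_route, best_len = None, float("inf")
    for order in permutations(middle):
        route = (start, *order, end)
        length = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if length < best_len:
            best_route, best_len = route, length
    return best_route, best_len

# Hypothetical positions along one street, in arbitrary units (not real distances).
positions = {
    "Dinosaur BBQ": 0,
    "Bar A": 3,
    "Bar B": 1,
    "Bar C": 2,
    "The Evergreen": 4,
}
dist = {
    a: {b: abs(pa - pb) for b, pb in positions.items()}
    for a, pa in positions.items()
}

route, length = shortest_crawl(dist, "Dinosaur BBQ", "The Evergreen")
```

Brute force is only practical for a handful of stops. With 15 bars and fixed endpoints there are 13! (about 6.2 billion) orderings of the middle stops, which is why an integer programming solver is a better fit for the full problem.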
It doesn’t end here…
As I finished my master's in Applied Data Science, I was required to complete two different exit requirement deliverables:
Exit Requirement Essay.
Exit Requirement Video/Presentation.
Personal GitHub.
As a data professional, I have created a personal GitHub that is data science focused.
All of these past projects are currently on my GitHub in a dedicated repository. Please note that I am working on a permalink so that you are not just looking at Markdown files.