Below you will find pages that utilize the taxonomy term “Data”
Post
Mountain Bike Categorization Analysis
For this post, I worked with Mike Czerwinski to determine whether the specifications of mountain bikes (MTB) are enough to differentiate between mountain bike categories.
Post
Reading Multiple CSVs into Merged R Dataframe
This script loads all of the .csv files containing polling place data into R and cleans them. The data, which is available for download here, is structured as follows:
- Each state (32 in total) has its own folder
- Within each state (folder), there are a variable number of CSV files, one for each year that polling place data is available
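For a folder layout like the one described above, one possible approach in base R looks roughly like this. It is only a sketch and not the post's actual script: the function name `read_polling_csvs`, the `state` and `source_file` columns, and the idea that the year can be recovered from each file name are all assumptions made for illustration.

```r
# Sketch only: assumes a layout of <root>/<state>/<one csv per year>.
# The function name and the added columns are hypothetical.
read_polling_csvs <- function(root) {
  # Find every .csv file in every state folder under the root directory.
  paths <- list.files(root, pattern = "\\.csv$", recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)

  frames <- lapply(paths, function(path) {
    # Read everything as character so that type mismatches between
    # files cannot break the merge; convert types during cleaning.
    df <- read.csv(path, stringsAsFactors = FALSE, colClasses = "character")
    df$state <- rep(basename(dirname(path)), nrow(df))  # folder name
    df$source_file <- rep(basename(path), nrow(df))     # file name
    df
  })

  # Files may not share identical columns, so pad each data frame
  # with NA columns until all of them have the same set of names.
  all_cols <- unique(unlist(lapply(frames, names)))
  frames <- lapply(frames, function(df) {
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA_character_, nrow(df))
    }
    df[all_cols]
  })

  # Stack everything into a single merged data frame.
  do.call(rbind, frames)
}
```

Calling `read_polling_csvs("path/to/data")` (a placeholder path) would return one data frame with a row for every row in every file. Reading all columns as character is a deliberate choice here, since it postpones type decisions to the cleaning step instead of letting `rbind` fail on inconsistent column types.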
Post
U.S. Police-Caused Fatalities
George Floyd. Eric Garner. Tamir Rice.
Those names, among thousands of others, are emblematic of fatal police violence in the United States. The protest movement that has spread like wildfire across the U.S. has brought police brutality to the forefront of everyone’s minds.
Following the deluge of information from news stations and social media, I can’t help but wonder: what does the data say? Are these new trends or longstanding realities?