Hello everyone, my name is Andrew Zimmerman and I am a senior at Benedictine finishing up my Computer Science degree. I work as a web developer for a golf company and also play soccer for BenU Mesa.
This week’s chapter introduced two of the most important tools in data science: descriptive statistics and probability distributions. Before working through the examples, I mostly thought of statistics as something abstract, used only in academic research. But Chapter 5 showed how directly these concepts influence real data analysis, prediction, and model evaluation. Descriptive statistics such as the mean, median, variance, and standard deviation give us a quick summary of what’s happening inside a dataset. Whether it’s the heights of a group of people or the ages of Titanic passengers, these values immediately reveal patterns like central tendency and spread, something spreadsheets can compute but Python handles more efficiently and reproducibly. What really stood out this week was seeing how probability distributions help us model uncertainty. The normal distribution examples made it easy to understand why so many real-world variables, like heights or measurement errors, tend to cluster symmetrically around an average.
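To make that concrete, here is a quick sketch of those summary statistics using Python's built-in statistics module. The height values are made up purely for illustration:

```python
import statistics

# Hypothetical sample of heights in centimeters (illustrative data only)
heights = [162.0, 170.5, 168.2, 175.1, 159.8, 181.3, 166.7]

mean = statistics.mean(heights)        # central tendency: average value
median = statistics.median(heights)    # central tendency: middle value
variance = statistics.variance(heights)  # spread: sample variance
std_dev = statistics.stdev(heights)      # spread: sample standard deviation

print(f"mean={mean:.2f}, median={median:.2f}, "
      f"variance={variance:.2f}, std dev={std_dev:.2f}")
```

Even this tiny example shows the idea from the chapter: a handful of numbers summarizes both where the data is centered and how much it varies, and the same few lines work just as well on thousands of rows.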
This week focused less on traditional coding lessons and more on experimenting with new AI-driven tools like Replit, which allow beginners and non-technical users to build working apps quickly. The big takeaway is that creating an app no longer requires deep programming knowledge; modern tools can generate, host, and edit code with AI assistance. Replit, in particular, makes it extremely easy to start a project, test ideas, and publish something that can be shared online. This is valuable not only for learning but for employability, since being able to say you’ve built and deployed an app, no matter how simple, helps your résumé and LinkedIn profile stand out. Using Replit’s AI features felt surprisingly accessible. You can describe what you want to build in natural language, and the platform generates the foundation of the app automatically. From there, Replit’s Ghostwriter AI helps refine and debug code in real time. Even if you don’t consider yourself a coder, you can still take an idea from a prompt to a published app.
This week, I worked through Chapter 3 of the Data Toolkit, which focused on handling and cleaning data using two of the most important Python libraries for data science: Pandas and NumPy. These tools are the foundation of almost every data-driven project, so learning how to manipulate real-world datasets with them is a major step toward becoming comfortable with data work. The chapter explained the difference between Pandas' main structures, Series and DataFrames, and showed how they make it easy to import data, explore it, and prepare it for analysis. What stood out to me is how Pandas transforms messy raw data into something structured and usable with just a few lines of code. I also learned how essential data cleaning is before any analysis can even begin. Chapter 3 walked through handling missing values, removing duplicates, renaming columns, and replacing incorrect values. Seeing each of these operations applied in context made it much easier to understand why cleaning has to happen before any real analysis.
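Those four cleaning steps can be sketched in a few lines of Pandas. The dataset, column names, and values below are all made up for illustration; they just stand in for the kind of messy data the chapter describes:

```python
import pandas as pd
import numpy as np

# Hypothetical messy dataset (names and values are illustrative)
df = pd.DataFrame({
    "Name ": ["Alice", "Bob", "Bob", "Cara"],   # awkward column name, duplicate row
    "age": [29.0, np.nan, np.nan, -1.0],        # missing value, impossible value
})

df = df.rename(columns={"Name ": "name"})   # renaming columns
df = df.drop_duplicates()                   # removing duplicates
df["age"] = df["age"].replace(-1.0, np.nan)  # replacing incorrect values
df["age"] = df["age"].fillna(df["age"].median())  # handling missing values

print(df)
```

Each line maps directly onto one of the operations from the chapter, and chaining them like this is exactly the kind of short, readable cleanup script that makes Pandas so practical.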