A year ago I started blogging consistently at about a weekly pace. The result: 61 unique posts, most recently 4000 monthly views, and the opportunity to co-author a book. In this post I wanted to review this year in blogging and share what I learnt. And I hope this motivates you to start […]
Confusion matrix, accuracy, recall, precision, false positive rate and F-scores explained
When building a machine learning model, it’s important to measure the results of your model. Typically, you split a dataset into a training dataset and a test dataset. The training dataset is used to train your model, while the test dataset is used to measure the performance of your model. A commonly used method to […]
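As a rough illustration of the metrics named in the title, here is a minimal sketch in plain Python. The labels in `y_true` and `y_pred` are made up for this example (they stand in for test-set labels and model predictions), and the helper names `confusion_counts` and `metrics` are hypothetical, not taken from the post.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, FN, TN for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Derive the common scores from the four confusion matrix cells."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1_denominator = precision + recall
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "f1": 2 * precision * recall / f1_denominator if f1_denominator else 0.0,
    }


# Made-up test-set labels and predictions, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(y_true, y_pred)
scores = metrics(*counts)
print(counts)
print(scores)
```

In practice you would more likely use a library such as scikit-learn for this; the hand-rolled version is only meant to show how each score follows from the four cells of the confusion matrix.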
How to sync a GitHub fork
I recently needed to sync a GitHub repo I forked to the latest state of the original repository. This is easy to do, but you have to know which buttons to push. To start, open the forked repo in GitHub. You should see a mention that this branch is behind the original branch. Next to […]
Tuning / regularizing common linear regressions and classifiers in scikit-learn
If you read my article in January about my personal development goals, you might have seen that I’m working to achieve the DP-200 certification. During the learning process for DP-200, I learned that I lacked certain basic knowledge about how to do data engineering / data science in Python. For that reason, I decided to […]
Accessing Key Vault Secrets in Kubernetes using the Key Vault CSI driver
Note: There’s a new post available combining CSI driver + AAD pod identity. When you store secrets in a Kubernetes cluster, by default those are stored in the etcd database within the master nodes. The same is true for secrets stored in an AKS cluster on Azure. The best practice for storing secrets is to […]