index.json
[{"authors":null,"categories":null,"content":"Hello and welcome to my personal website! I am an aspiring biostatistician interested in incorporating digital technologies into healthcare systems for monitoring health and tracking disease progression. As a master’s degree candidate at the Harvard Chan School of Public Health Department of Biostatistics, I have developed a multifaceted skill set of quantitative methods. I currently work on JP Onnela’s team where I support Beiwe - a research platform that gathers and analyzes physical and other data from participants’ iOS and Android mobile devices. In May of 2023, I graduated from Brandeis University with two Bachelors of Science in Neuroscience and Biology and minors in Computer Science and Chemistry. I grew up in Moscow, Russia, attending the Anglo American School of Moscow and graduating with an International Baccalaurreate diploma.\n","date":1724803200,"expirydate":-62135596800,"kind":"term","lang":"en","lastmod":1724803200,"objectID":"2525497d367e79493fd32b198b28f040","permalink":"","publishdate":"0001-01-01T00:00:00Z","relpermalink":"","section":"authors","summary":"Hello and welcome to my personal website! I am an aspiring biostatistician interested in incorporating digital technologies into healthcare systems for monitoring health and tracking disease progression. As a master’s degree candidate at the Harvard Chan School of Public Health Department of Biostatistics, I have developed a multifaceted skill set of quantitative methods.","tags":null,"title":"Max Melnikas","type":"authors"},{"authors":["Max Melnikas"],"categories":null,"content":"Car Database Page Note Last Day Last day of car listings Last 7 Days Last week of car listings Last 30 Days Last month of car listings Description The AutoDev API was accessed to gather the data necessary to train the model. Toyota Camrys were used for this project due to their popularity and desirability. Cars produced before 2015 were not included. There were no mileage parameters, meaning new cars were included in the training set. Additionally, a radius of 150 miles around Boston was imposed for this search. Most (~85%) Camrys that fit this criteria fell into one of four trims: LE, SE, XLE, XSE. These four were the only ones considered for this project.\nIn the regression model, trim was expanded into dummy (indicator) variables for each trim. Mileage was included as a simple linear effect. Year was incorporated in a complex way: after converting model year into an integer age each age was further log transformed to account for the marginally decreasing depreciation rate with each additional year of age. The rationale behind this decision came from the logic that a change from ‘year 1’ to ‘year 2’ should be more significant than ‘year 6’ to ‘year 7’. Furthermore, each year was granted its own dummy (indicator) variable as well. This added flexibility was introduced to qualify how the data deviates from the assumed log year term. The final model was selected using likelihood ratio tests.\nThe next step of this project lead me into the area of automation. After generating a script that made a new API call and ran it through the pre-trained regression model. This step produced an estimated price of the car. Comparison of this price with the actual price of the vehicle lead to discount prices. The automation step of this project was a bit tricky but was accomplished with GitHub Actions and YAML. By using YAML files, I enabled an automatic trigger for my API script to run every day. 
\nThe next step of this project led me into the area of automation. I wrote a script that made a new API call each day and ran the listings through the pre-trained regression model, producing an estimated price for each car. Comparing this estimate with the actual asking price yielded a discount for each vehicle. The automation step of this project was a bit tricky but was accomplished with GitHub Actions and YAML. By using YAML files, I enabled an automatic trigger for my API script to run every day. Following this, another script pushed these changes to this website (a sketch of such a workflow file appears after this entry).\n","date":1724803200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1724803200,"objectID":"68621adfcee85d43c638f71b71cf1f94","permalink":"https://mmel099.github.io/project/used-car-finder/","publishdate":"2024-08-28T00:00:00Z","relpermalink":"/project/used-car-finder/","section":"project","summary":"An automated pipeline that gathers daily car listing data through an API and notifies of cars with discounts.","tags":["Regression"],"title":"Car Market Analysis: Finding High Discount Listings","type":"project"},
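A minimal sketch of the scheduled GitHub Actions workflow described above. The cron schedule, workflow file name, and script name (fetch_listings.py) are hypothetical placeholders, not the project's actual configuration.

# .github/workflows/daily-update.yml (hypothetical name and contents)
name: daily-car-data
on:
  schedule:
    - cron: "0 12 * * *"   # trigger automatically every day at 12:00 UTC
jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Call the API and score the listings with the pre-trained model
      - run: python fetch_listings.py
      # Push the refreshed data so the website picks it up
      - run: |
          git config user.name "github-actions"
          git config user.email "actions@github.com"
          git add -A
          git commit -m "Daily listings update" || echo "No changes"
          git push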
{"authors":null,"categories":null,"content":"The Hodgkin-Huxley model stands as a milestone in neuroscience, offering a comprehensive framework to comprehend the intricate dynamics of neuronal excitability. Developed by Sir Alan Lloyd Hodgkin and Sir Andrew Fielding Huxley in 1952, this model has been pivotal in shaping our understanding of how neurons generate and propagate electrical signals. In this blog post, we embark on a journey into the depths of the Hodgkin-Huxley neural model, exploring its inception, key components, and its profound impact on neuroscience.\nThe Birth of the Hodgkin-Huxley Model: The Hodgkin-Huxley model emerged as a result of pioneering experiments conducted on the giant axon of the squid. Hodgkin and Huxley meticulously studied the ion currents across the neuronal membrane, aiming to unravel the mechanisms underlying action potentials. Their groundbreaking work, which earned them the Nobel Prize in Physiology or Medicine in 1963, laid the foundation for the Hodgkin-Huxley model.\nUnderstanding Neuronal Excitability: The Hodgkin-Huxley model provides a mathematical description of the electrical activity in neurons. At its core, the model is based on the principles of ion channels and the flow of ions across the cell membrane. Sodium (Na+), potassium (K+), and leak channels play pivotal roles in shaping the dynamics of action potentials.\nComponents of the Hodgkin-Huxley Model: Membrane Potential (Vm): The electric potential difference across the neuronal membrane. It is a dynamic variable influenced by ion currents and governs the excitability of the neuron.\nVoltage-Gated Ion Channels (Na+, K+): Membrane structures that facilitate the transport of ions based on the membrane potential.\nSodium channels are responsible for the rapid influx of sodium ions during depolarization Potassium channels mediate the outward flow of potassium ions during repolarization Leak Channels: Contribute to the resting membrane potential by allowing a continuous, passive flow of ions along the concentration gradient. While not as selective as voltage-gated channels, leak channels help maintain the baseline membrane potential.\nCurrent ( $I_{\\text{app}}, I_{\\text{Na}}, I_{\\text{K}}, I_{\\text{L}}$): Movement of charged particles across a membrane. In this model, there are four sources of current: Iapp, which is externally applied to the neuron, and the three currents produced by the movement of ions through the sodium, potassium, and leak channels described above.\nGating Variables (m, h, n): Gating variables represent the activation and inactivation states of sodium and potassium channels.\nm represents the activation of sodium channels. h represents the inactivation of sodium channels. n represents the activation of potassium channels. Transition Rate Constants (α, β): These factors guide the dynamics of ion channel state changes.\nα is the number of times per second that a gate which is in the shut state opens β is the number of times per second that a gate which is in the open state shuts Mathematical Formulation: The model consists of coupled differential equations that describe the changes in membrane potential and gating variables over time; integrating them numerically (here with Euler’s method) enables simulations of action potentials under various conditions, elegantly capturing the complex interplay between ion channels.\nMembrane Potential Equation $$ C_m \\frac{dV_m}{dt} = I_{\\text{app}} + I_{\\text{Na}} + I_{\\text{K}} + I_{\\text{L}} $$ Sodium Current Equation $$ I_{\\text{Na}} = g_{\\text{Na}}m^3h(E_{\\text{Na}} - V_m) $$ Potassium Current Equation $$ I_{\\text{K}} = g_{\\text{K}}n^4(E_{\\text{K}} - V_m) $$ Leak Current Equation $$ I_{\\text{L}} = g_{\\text{L}}(E_{\\text{L}} - V_m) $$ Gating Variables Equations $$ \\frac{dm}{dt} = \\alpha_m(1 - m) - \\beta_m m $$ $$ \\frac{dh}{dt} = \\alpha_h(1 - h) - \\beta_h h $$ $$ \\frac{dn}{dt} = \\alpha_n(1 - n) - \\beta_n n $$ Alpha $(\\alpha)$ and Beta $(\\beta)$ Formulas \\begin{align*} \\text{For } m: \u0026amp; \\quad \\alpha_m = \\frac{10^5 (-V_m - 0.045)}{\\exp(100 (-V_m - 0.045)) - 1} \\\\ \u0026amp; \\quad \\beta_m = 4 \\times 10^3 \\exp\\left(\\frac{-V_m - 0.070}{0.018}\\right) \\\\ \\\\ \\text{For } h: \u0026amp; \\quad \\alpha_h = 70 \\exp(50 (-V_m - 0.070)) \\\\ \u0026amp; \\quad \\beta_h = \\frac{10^3}{\\exp(100 (-V_m - 0.040)) + 1} \\\\ \\\\ \\text{For } n: \u0026amp; \\quad \\alpha_n = \\begin{cases} 100 \u0026amp; \\text{if } V_m = -0.060 \\\\ \\frac{10^4 (-V_m - 0.060)}{\\exp(100 (-V_m - 0.060)) - 1} \u0026amp; \\text{otherwise} \\end{cases} \\\\ \u0026amp; \\quad \\beta_n = 125 \\exp\\left(\\frac{-V_m - 0.070}{0.08}\\right) \\\\ \\end{align*} Note: The if statement applies L\u0026#39;Hôpital\u0026#39;s rule to calculate $\\alpha_n$. function HodgkinHuxley() % This is a famous neuron model developed by Hodgkin and Huxley % There are four time-dependent variables: sodium activation, sodium inactivation, % potassium activation, and membrane potential % Default model parameters Gl = 3*10^-8; % Leak Conductance (S) Gna = 1.2*10^-5; % Maximum Sodium Conductance (S) Gk = 3.6*10^-6; % Maximum Delayed Rectifier Conductance (S) Ena = 4.5*10^-2; % Sodium Reversal Potential (V) Ek = -8.2*10^-2; …","date":1704499200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1704499200,"objectID":"0958a9b54bf04effc1838028b70a7ec6","permalink":"https://mmel099.github.io/post/hodgkin-huxley/","publishdate":"2024-01-06T00:00:00Z","relpermalink":"/post/hodgkin-huxley/","section":"post","summary":"Simulating Neural Activity using the Hodgkin-Huxley Model","tags":null,"title":"Hodgkin-Huxley Model","type":"post"},
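As a companion to the truncated MATLAB listing above, here is a minimal Python sketch of one Euler integration step using the equations given in that post. Conductances and reversal potentials mirror the MATLAB defaults; El and Cm do not appear in the visible portion of the listing and are assumed values, as are all function and variable names.

# Minimal Euler-step sketch of the Hodgkin-Huxley equations above.
# Gl, Gna, Gk, Ena, Ek come from the MATLAB snippet; El and Cm are
# assumed here, since the listing is truncated before they appear.
import numpy as np

Gl, Gna, Gk = 3e-8, 1.2e-5, 3.6e-6       # conductances (S)
Ena, Ek = 4.5e-2, -8.2e-2                # reversal potentials (V)
El = -6.0e-2                             # leak reversal potential (V); assumed
Cm = 1e-10                               # membrane capacitance (F); assumed

def euler_step(Vm, m, h, n, Iapp, dt):
    # Rate constants (alpha, beta) as defined in the formulas above;
    # the isclose guards apply the L'Hopital limits at the 0/0 points.
    am = 1e3 if np.isclose(Vm, -0.045) else \
        1e5 * (-Vm - 0.045) / (np.exp(100 * (-Vm - 0.045)) - 1)
    bm = 4e3 * np.exp((-Vm - 0.070) / 0.018)
    ah = 70 * np.exp(50 * (-Vm - 0.070))
    bh = 1e3 / (np.exp(100 * (-Vm - 0.040)) + 1)
    an = 100.0 if np.isclose(Vm, -0.060) else \
        1e4 * (-Vm - 0.060) / (np.exp(100 * (-Vm - 0.060)) - 1)
    bn = 125 * np.exp((-Vm - 0.070) / 0.08)
    # Ionic currents in driving-force form, matching the equations above
    Ina = Gna * m**3 * h * (Ena - Vm)
    Ik = Gk * n**4 * (Ek - Vm)
    Il = Gl * (El - Vm)
    # Euler updates for membrane potential and gating variables
    Vm += dt * (Iapp + Ina + Ik + Il) / Cm
    m += dt * (am * (1 - m) - bm * m)
    h += dt * (ah * (1 - h) - bh * h)
    n += dt * (an * (1 - n) - bn * n)
    return Vm, m, h, n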
{"authors":null,"categories":null,"content":"Clustering is a fundamental concept in machine learning, and one of the go-to algorithms for partitioning data into distinct groups is the K-Means algorithm.\nUnderstanding the Basics: K-Means is a centroid-based clustering algorithm designed to partition a dataset into K clusters. The “K” in K-Means represents the predetermined number of clusters the algorithm aims to identify. It’s an unsupervised learning method, meaning it does not require labeled training data to function.\nHow It Works: Initialization: K-Means begins by selecting K initial centroids, one for each cluster. These centroids can be randomly chosen data points or strategically placed based on some prior knowledge.\nAssigning Data Points to Clusters: Each data point is assigned to the cluster whose centroid is closest to it. Proximity is typically measured using Euclidean distance, but other distance metrics can be employed based on the nature of the data.\nUpdating Centroids: After the initial assignments, the centroids are recalculated by taking the mean of all the data points within each cluster. This step iterates until the centroids no longer change significantly or a predetermined number of iterations is reached.\nConvergence: The algorithm converges when the centroids stabilize, and the assignments remain unchanged between iterations.\n# This is a K-Means algorithm implemented from scratch # It initializes k random centroids and assigns points to clusters using Euclidean distance # With each iteration, centroids are recalculated based on the mean of data points assigned to that cluster # If a smaller Euclidean distance to another centroid is identified, the data point is reassigned to the other cluster # The algorithm will converge when centroids become static or after 100 iterations, whichever happens first # Arguments: k is predetermined number of clusters; X is input data # Returns: centroids are values of cluster centers; labels are cluster assignments for each data point import numpy as np import pandas as pd def KMeansFromScratch(k, X): centroids = X[np.random.choice(range(X.shape[0]), k, replace=False)] for i in range(100): prev_centroids = centroids distances = np.concatenate([np.linalg.norm(X - centroid, axis=1).reshape(-1, 1) for centroid in centroids],axis=1) labels = np.argmin(distances, axis=1) centroids = pd.DataFrame(np.concatenate([X, labels.reshape(-1, 1)], axis=1)).groupby(X.shape[1]).mean().values if np.array_equal(centroids, prev_centroids): return centroids, labels return centroids, labels Image credit: Wikipedia, author Chire\nApplications of K-Means: Image Compression: K-Means can be used to reduce the number of colors in an image, effectively compressing it while preserving its essential features.\nCustomer Segmentation: Businesses utilize K-Means to categorize customers based on purchasing behavior, allowing for targeted marketing strategies.\nAnomaly Detection: K-Means can identify outliers or anomalies in datasets by grouping normal data points into clusters and isolating those that deviate.\nGenetic Clustering: In biological research, K-Means helps classify genes with similar expression patterns, aiding in the identification of gene functions.\nChallenges and Considerations: Sensitivity to Centroid Initialization: K-Means results can vary based on the initial placement of centroids, and multiple runs with different initializations may be needed to find the optimal solution.\nAssumption of Spherical Clusters: The algorithm assumes that clusters are spherical and equally sized, which may not be suitable for all types of data.\nPossibility of Empty Clusters: As the algorithm iterates to convergence, there is a chance that all data points get reassigned from a cluster, leaving it empty. A common solution is to choose a random point to act as a new centroid if any empty clusters are detected, as in the sketch below.\n
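A small sketch of that empty-cluster fix, under the same assumptions as the from-scratch implementation above. Reseeding from a random data point is one common choice, not the only one; the function name is illustrative.

# Hypothetical patch for the from-scratch K-Means above: if a cluster ends
# up empty after assignment, reseed its centroid with a random data point.
import numpy as np

def fix_empty_clusters(X, centroids, labels, k, rng=None):
    rng = rng or np.random.default_rng()
    for j in range(k):
        if not np.any(labels == j):                      # cluster j lost all its points
            centroids[j] = X[rng.integers(X.shape[0])]   # reseed from the data
    return centroids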
# This is a comprehensive K-Means Algorithm using the sklearn package # Outliers are identified using isolation forest and removed # Model fit is evaluated using normalized mutual information by comparing predicted labels to ground truth labels # Ten runs of K-Means are completed and the model with the highest NMI score is selected # Arguments: k is predetermined number of clusters; X is input data; true_labels are ground truth labels # Returns: final_centroids are values of cluster centers; final_labels are cluster assignments for each data point from sklearn.cluster import KMeans from sklearn.ensemble import IsolationForest from sklearn.metrics import normalized_mutual_info_score import numpy as np def ComprehensiveKMeans(k, X, true_labels): isolation_forest = IsolationForest() isolation_forest.fit(X) outliers = isolation_forest.predict(X) X = X[outliers == 1] true_labels = true_labels[outliers == 1] best_score = -1 best_model = None for i in range(10): kmeans = KMeans(n_clusters=k) kmeans.fit(X) labels = kmeans.predict(X) score = normalized_mutual_info_score(labels, true_labels) if score \u0026gt; best_score: best_score = score best_model = kmeans final_labels = best_model.predict(X) final_centroids = …","date":1704153600,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1704153600,"objectID":"49862c2d07e897408721feb14dec3c05","permalink":"https://mmel099.github.io/post/kmeans/","publishdate":"2024-01-02T00:00:00Z","relpermalink":"/post/kmeans/","section":"post","summary":"Introduction to K-Means Clustering","tags":null,"title":"K-Means Algorithm","type":"post"},{"authors":["Max Melnikas"],"categories":null,"content":"Abstract In the realm of psychology and neuroscience, understanding human experiences and emotions through word usage can present a fascinating and difficult challenge. Word choice can be highly person- and context-dependent. However, with a large enough sample of written answers to a single question prompt, we may be able to identify certain trends in word usage.\nIn this project, I used a dataset (X. Alice Li and Devi Parikh, 2019) that contains a large number (N = 1473) of written responses to the question: “What were salient aspects of your day yesterday? How did you feel about them?”. Additionally, each response is labelled with one or more emotions from an exhaustive list of 18 different emotions.\nIn my analysis, I attempted to find associations between frequent words that participants included in their responses and the emotions these responses were labelled with. Next, I explored word co-occurrence and the associations of word pairs to emotions. Certain trends in word usage and emotional labels were identified throughout the analysis; however, an issue with the sample size was also noted.\n","date":1702944000,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1702944000,"objectID":"953cf47cba974eb597bec69b1719b3fa","permalink":"https://mmel099.github.io/project/pathos/","publishdate":"2023-12-19T00:00:00Z","relpermalink":"/project/pathos/","section":"project","summary":"Analyzing text patterns using the Apriori algorithm and drawing associations with emotions using logistic regression","tags":["Classification","Clustering"],"title":"Pathos: Associations of Word Usage and Emotions","type":"project"},
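The Pathos summary above names the Apriori algorithm for frequent words and word pairs and logistic regression for the emotion associations. Below is a minimal sketch of that kind of pipeline, assuming a list responses of tokenized answers and a 0/1 vector has_emotion for one of the 18 labels; both names, the mlxtend-based encoding, and the thresholds are illustrative assumptions, not the project's actual code.

# Hypothetical sketch: Apriori for frequent words/word pairs, then logistic
# regression linking those patterns to one emotion label. `responses` (list
# of token lists) and `has_emotion` (0/1 per response) are assumed inputs.
import numpy as np
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori
from sklearn.linear_model import LogisticRegression

def emotion_word_associations(responses, has_emotion, min_support=0.05):
    # One-hot encode each response as a "transaction" of words
    te = TransactionEncoder()
    onehot = pd.DataFrame(te.fit(responses).transform(responses),
                          columns=te.columns_)
    # Frequent single words and word pairs (co-occurrences) via Apriori
    itemsets = apriori(onehot, min_support=min_support,
                       use_colnames=True, max_len=2)
    # Indicator features: does a response contain each frequent itemset?
    feats = np.column_stack([
        onehot[list(s)].all(axis=1).to_numpy() for s in itemsets["itemsets"]
    ]).astype(float)
    model = LogisticRegression(max_iter=1000).fit(feats, has_emotion)
    # Larger coefficients suggest stronger association with the emotion
    return itemsets.assign(coef=model.coef_.ravel())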
{"authors":["Max Melnikas","Alex Mellott","Naveen Pednekar"],"categories":null,"content":"Abstract Sleep quality is a critical component of overall well-being, with numerous factors affecting its duration and depth. Among these factors, nutrition plays a pivotal yet underexplored role in regulating sleep quality. Accurately measuring an individual’s dietary intake is a fundamental challenge in nutritional research. The National Health and Nutrition Examination Survey (NHANES) is an annual survey conducted by the Centers for Disease Control and Prevention (CDC) that collects various health-related data and weights it to be nationally representative.\nThis project takes advantage of the large sample size of the NHANES dataset to draw associations between macronutrient predictors and sleep quality outcomes. Moreover, the demographic data collected through NHANES offers us a way to investigate relevant confounders that are associated with both nutrition and sleep. We identified three final outcome variables related to sleep quality. One outcome was the duration of sleep on weekdays, rounded to the nearest half-hour; this outcome was modeled using multiple linear regression. Another relevant outcome was an indicator for whether the participant had ever told a doctor about trouble sleeping; this was modeled using logistic regression. The final outcome was a categorical variable asking how often a participant felt overly sleepy during the past month; this was modeled using multinomial regression. Furthermore, we aggregated our three sleep outcomes into a single overall metric of sleep quality and fit a Quasi-Poisson regression model.\nFiber intake was found to be positively associated with sleep quality across the linear, multinomial, and Quasi-Poisson regressions. Protein was found to have a negative association with length and quality of sleep across the Quasi-Poisson and linear models. Carbohydrates were found to have a harmful effect on sleep quality in the adjusted multinomial models.\n","date":1702166400,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1702166400,"objectID":"d606de4324901fcb49244859d1ca98e3","permalink":"https://mmel099.github.io/project/sweet-dreams/","publishdate":"2023-12-10T00:00:00Z","relpermalink":"/project/sweet-dreams/","section":"project","summary":"Exploring various sleep quality endpoints with multiple, logistic, multinomial and Poisson regressions","tags":["Regression","Classification"],"title":"Sweet Dreams: A Regression Analysis of Macronutrient Intake and Sleep Quality","type":"project"},
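A minimal sketch of the aggregated Quasi-Poisson fit described in the abstract above: in statsmodels, quasi-Poisson amounts to a Poisson-family GLM with a freely estimated dispersion. The DataFrame nhanes, the composite sleep_score, and the predictor/confounder column names are hypothetical assumptions, not the project's actual code.

# Hypothetical sketch of the Quasi-Poisson model for the aggregated sleep
# score. `nhanes` and all column names are assumed, not the project's code.
import statsmodels.api as sm
import statsmodels.formula.api as smf

def fit_quasipoisson(nhanes):
    # Poisson family with scale="X2" (Pearson chi-squared estimate of the
    # dispersion) gives the quasi-Poisson fit.
    model = smf.glm(
        "sleep_score ~ fiber + protein + carbs + age + gender",
        data=nhanes,
        family=sm.families.Poisson(),
    )
    return model.fit(scale="X2")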
{"authors":[],"categories":null,"content":"I created a video for my final presentation for a course in linear algebra at Brandeis University. The assignment required a tutorial video on a topic related to something covered in class. My presentation outlined the principal component analysis (PCA) method and its relationship to eigenvectors and eigenvalues.\n","date":1670587200,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1670587200,"objectID":"a8edef490afe42206247b6ac05657af0","permalink":"https://mmel099.github.io/talk/principal-component-analysis/","publishdate":"2022-12-09T12:00:00Z","relpermalink":"/talk/principal-component-analysis/","section":"event","summary":"A brief tutorial on PCA and its relation to eigenvectors","tags":[],"title":"Principal Component Analysis","type":"event"},{"authors":null,"categories":["R"],"content":" R Markdown This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.\nYou can embed an R code chunk like this:\nsummary(cars) ## speed dist ## Min. : 4.0 Min. : 2.00 ## 1st Qu.:12.0 1st Qu.: 26.00 ## Median :15.0 Median : 36.00 ## Mean :15.4 Mean : 42.98 ## 3rd Qu.:19.0 3rd Qu.: 56.00 ## Max. :25.0 Max. :120.00 fit \u0026lt;- lm(dist ~ speed, data = cars) fit ## ## Call: ## lm(formula = dist ~ speed, data = cars) ## ## Coefficients: ## (Intercept) speed ## -17.579 3.932 Including Plots You can also embed plots. See Figure 1 for example:\npar(mar = c(0, 1, 0, 1)) pie( c(280, 60, 20), c(\u0026#39;Sky\u0026#39;, \u0026#39;Sunny side of pyramid\u0026#39;, \u0026#39;Shady side of pyramid\u0026#39;), col = c(\u0026#39;#0292D8\u0026#39;, \u0026#39;#F7EA39\u0026#39;, \u0026#39;#C4B632\u0026#39;), init.angle = -50, border = NA ) Figure 1: A fancy pie chart. ","date":1606875194,"expirydate":-62135596800,"kind":"page","lang":"en","lastmod":1606875194,"objectID":"bf1eb249db79f10ace7d22321494165a","permalink":"https://mmel099.github.io/post/2020-12-01-r-rmarkdown/","publishdate":"2020-12-01T21:13:14-05:00","relpermalink":"/post/2020-12-01-r-rmarkdown/","section":"post","summary":"R Markdown This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.","tags":["R Markdown","plot","regression"],"title":"Hello R Markdown","type":"post"}]