SIAM Undergraduate Research Online

Volume 12

Computing shape DNA using the closest point method

Published electronically January 13, 2019
DOI: 10.1137/18S016801

Authors: Rachel Han (University of British Columbia) and Chingyi Tsoi (Hong Kong Baptist University)
Sponsor: Colin Macdonald (University of British Columbia)

Abstract: We demonstrate an application of the closest point method to numerically computing the truncated spectrum of the Laplace-Beltrami operator. This is known as the “Shape DNA” and it can be used to identify objects in various applications. We prove a result about the null-eigenvectors of the numerical discretization. We also investigate the effectiveness of the method with respect to invariants of the Shape DNA. Finally, we experiment with clustering similar objects via a multi-dimensional scaling algorithm.

Opinion Formation Dynamics with Contrarians and Zealots

Published electronically February 12, 2019
DOI: 10.1137/18S017314

Author: Kaitlyn Eekhoff (Calvin College)
Sponsor: Todd Kapitula (Calvin College)

Abstract: Mean-field type ODE models for opinion dynamics often assume that the entire population is composed of congregators, who are agreeable. On the other hand, a contrarian opinion dynamics ODE model assumes the population has two personality types: congregators, and contrarians, who are disagreeable. In this paper we broadly study how contrarians influence the ability of the population to form a fixed and stable opinion. In particular, we re-examine the dynamics associated with the model introduced by Tanabe and Masuda [12] by looking at how the parameters affect the formation of stable periodic solutions (whose existence implies there is no fixed consensus opinion). Afterwards, we refine and analyze the model under two new hypotheses: (a) the contrarians bow to peer pressure and change their personality type to congregators if a large enough proportion of the entire population agrees on an opinion, and (b) there are zealots associated with one of the opinions. We conclude with a brief discussion on possible extensions of this work.

A Bayesian Model for the Prediction of United States Presidential Elections

Published electronically February 18, 2019
DOI: 10.1137/17S016166

Author: Brittany Alexander (Texas Tech University)
Sponsor: Leif Ellingson (Texas Tech University)

Abstract: Using a combination of polling data and previous election results, FiveThirtyEight successfully predicted the Electoral College distribution in the presidential election in 2008 with 98% accuracy and in 2012 with 100% accuracy. This study applies a Bayesian analysis of polls, assuming a normal distribution of poll results and using a normal conjugate prior. The data were taken from the Huffington Post's Pollster. States were divided into categories based on past results and current demographics. Each category used a different poll source for the prior. This model was originally used to predict the 2016 election, but later it was applied to the poll data for 2008 and 2012. For 2016, the model had 88% accuracy for the 50 states. For 2008 and 2012, the model had the same Electoral College prediction as FiveThirtyEight. The method of using state and national polls as a prior in election prediction seems promising, and further study is needed.
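
The normal conjugate update mentioned in this abstract can be illustrated with a minimal sketch. It is not the author's model: the function name, the known-variance assumption, and the sample numbers are all hypothetical, chosen only to show how a normal prior and normally distributed poll results combine into a normal posterior.

```python
def normal_posterior(prior_mean, prior_var, polls, poll_var):
    """Posterior mean and variance for a normal mean with known poll variance.

    Assumption (not from the paper): every poll is an independent draw from
    N(true_share, poll_var), and the prior on true_share is
    N(prior_mean, prior_var).
    """
    n = len(polls)
    sample_mean = sum(polls) / n
    # Precisions (inverse variances) add under the conjugate update.
    posterior_precision = 1.0 / prior_var + n / poll_var
    posterior_mean = (
        prior_mean / prior_var + n * sample_mean / poll_var
    ) / posterior_precision
    return posterior_mean, 1.0 / posterior_precision


# Hypothetical numbers: a prior centered at 50% and two polls at 52% and 54%.
mean, var = normal_posterior(0.50, 0.01, [0.52, 0.54], 0.01)
print(mean, var)
```

The posterior mean is a precision-weighted average of the prior mean and the poll average, so more polls (or less noisy ones) pull the estimate further from the prior.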

A Lattice-Based Approach to the PSQ Smoking Model

Published electronically March 7, 2019
DOI: 10.1137/18S017077

Author: Shengding Sun (The University of North Carolina at Chapel Hill)
Sponsor: Nancy Rodriguez (The University of North Carolina at Chapel Hill)

Abstract: We study the dynamics of smoking behavior of agents with a stochastic lattice-based model, assuming that each agent occupies a node and is influenced by its neighbors. This mechanism is adapted from the PSQ smoking model, which is based on a system of ordinary differential equations. The difference in this model is that, more realistically, potential smokers are only influenced by nearby current smokers, instead of all smokers. In addition, the stochasticity of this model also accounts better for the randomness in real-world smoking behavior. It is shown here that the quantitative estimates of this new lattice model are significantly different from the previous numerical results obtained in other works using the ODE model. This suggests that taking locality into account affects the model behavior. The critical exponents of this new lattice smoking model under the von Neumann neighborhood condition are calculated and verified to be the same as those of the classic SIRS epidemic model, which classifies this model as belonging to the directed percolation class. We also consider the model in the continuum setting, and solve the system numerically using a particular convolution kernel. To the author's knowledge, this is the first time that this widely used and discussed PSQ smoking model has been incorporated into the lattice-based setting, and our results show that this changes the quantitative behavior of the PSQ model significantly.

CID Models on Real-world Social Networks and Goodness of Fit Measurements

Published electronically March 7, 2019
DOI: 10.1137/18S017260

Authors: Jun Hee Kim, Eun Kyung Kwon, and Qian Sha (Carnegie Mellon University)
Sponsor: Brian Junker (Carnegie Mellon University)

Abstract: Assessing the model fit quality of statistical models for network data is an ongoing and underexamined topic in statistical network analysis. Traditional metrics for evaluating model fit on tabular data such as the Bayesian Information Criterion are not suitable for models specialized for network data. We propose a novel goodness of fit (GOF) measure, the “stratified-sampling cross-validation” (SCV) metric, that uses a procedure similar to traditional cross-validation via stratified-sampling to select dyads in the network’s adjacency matrix to be removed. SCV is capable of intuitively expressing different models’ ability to predict on missing dyads. Using SCV on real-world social networks, we identify the appropriate statistical models for different network structures and generalize such patterns. In particular, we focus on conditionally independent dyad (CID) models such as the Erdős-Rényi model, the stochastic block model, the sender-receiver model, and the latent space model.

An Integro-Differential Model of Language Competition

Published electronically April 25, 2019
DOI: 10.1137/18S017363

Authors: Mallory Gaspard, Peter Craig, and Erik Bergland (Rensselaer Polytechnic Institute)
Sponsor: Peter Kramer (Rensselaer Polytechnic Institute)

Abstract: We study the language shift and competition between the twelve most prominent world languages while accounting for factors affecting these trends such as governmental influences, migration between nations, and the interaction between competing languages. To model these effects, we propose an integro-differential equation, which is a partial differential equation (PDE), that takes the aforementioned factors into account and predicts the fate of these languages with regard to time and geography. We also carry out a stability analysis of our proposed model under certain circumstances.

In the first part of the investigation, following the establishment of our integro-differential equation model, we also construct a weighted digraph in Python using the United Nations Migrant Data from 1990-2017 to identify the geographic locations and languages that act as keystones in the global language network. In addition, we execute a numerical simulation of our PDE model in Python to model the projected future language shifts over time and compare the results from our model to the centrality calculations carried out on our digraph. From the numerical simulations, we predict that the number of monolingual Hindustani speakers will show the greatest growth. Also, in terms of the number of first-language speakers, English will pass Spanish and Russian will pass Bengali. Furthermore, from our model, it is estimated that in the next fifty years, we can expect to see a rise in the number of English speakers, which will remain a clear second beneath Mandarin. We can also expect to see a decrease in the number of Bengali speakers.

Global Solution to a Non-linear Wave Equation of Liquid Crystal in the Constant Electric Field

Published electronically May 13, 2019
DOI: 10.1137/18S017557

Author: Linjun Huang (University of California, Davis)
Sponsor: Qingtian Zhang (University of California, Davis)

Abstract: We construct a global conservative weak solution to the Cauchy problem for the non-linear variational wave equation $v_{tt} - c(v)\,(c(v)\,v_x)_x + \tfrac{1}{2}\,g(v) = 0$, where $g(v)$ is defined in (2.5) and $c(\cdot)$ is any smooth function with uniformly positive bounded value. This wave equation is derived from a wave system modelling nematic liquid crystals in a constant electric field.

Analytical Solutions of the Susceptible-Infected-Virus (SIV) Model

Published electronically June 3, 2019
DOI: 10.1137/18S017545

Author: Emily MacIndoe (University of Mary Washington)
Sponsor: Leo Lee (University of Mary Washington)

Abstract: The Susceptible-Infected-Virus (SIV) model is a compartmental model to describe within-host dynamics of a viral infection. We apply the SIV model to the human immunodeficiency virus (HIV); in particular, we present analytical solutions to two versions of the model. The first version includes only terms related to the susceptible cell-virus particle interaction and virus production, while the second includes those terms in addition to the infected cell death rate. An analytical solution, although more challenging and time-consuming than numerical methods, has the advantage of giving exact, rather than approximate, results. These results contribute to our understanding of virus dynamics and could be used to develop better treatment options. The approach used to solve each model involved first isolating one of the dependent variables, that is, deriving an equation that involves only one of the variables and its derivatives. Next, various substitutions were used to bring the equation to a more easily solvable form. For the first model, an exact solution is obtained in the form of an implicit equation. For the second model, we give an analytical solution generated by an iterative method.

Substance Use and Abuse

Published electronically June 7, 2019
DOI: 10.1137/19S1259870
M3 Challenge Introduction

Authors: Eric Chai, Gustav Hansen, Emily Jiang, Kylie Lui, and Jason Yan (High Technology High School, Lincroft, NJ)
Sponsor: Raymond Eng (High Technology High School, Lincroft, NJ)

Abstract: In recent years, substance abuse has intensified to an alarming degree in the United States. In particular, the rise of vaping, a new form of nicotine consumption, is dangerously exposing drug abuse to a new generation. With the need to understand how substance use spreads and impacts individuals differently, our team seeks to provide a report with mathematically-founded insights on this prevalent issue.

The repercussions of substance abuse are reverberating and remain with an individual for life. However, drugs not only severely affect the user but also cause extensive societal harm. Increased understanding of the projected spread and impact of substance abuse, as well as the underlying factors that lead to poor judgement, are needed to optimize measures to restrict consumption. Ultimately, we believe that our models provide novel insight into the nationwide issue of substance use and abuse.

Limitations of Richardson Extrapolation for Kernel Density Estimation

Published electronically June 27, 2019
DOI: 10.1137/18S01757

Author: Ruben Ascoli (Thomas Jefferson High School for Science and Technology)
Sponsor: Tyrus Berry (George Mason University)

Abstract: This paper develops the process of using Richardson Extrapolation to improve the Kernel Density Estimation method, resulting in a more accurate (lower Mean Squared Error) estimate of a probability density function for a distribution of data in $\mathbb{R}^d$ given a set of data from the distribution. The method of Richardson Extrapolation is explained, showing how to fix conditioning issues that arise with higher-order extrapolations. Then, it is shown why higher-order estimators do not always provide the best estimate, and it is discussed how to choose the optimal order of the estimate. It is shown that given $n$ one-dimensional data points, it is possible to estimate the probability density function with a mean squared error value on the order of only $n^{-1}\ln(n)$. Finally, this paper introduces a possible direction of future research that could further minimize the mean squared error.

Survival Analysis of Young Leukemia Patients

Published electronically July 8, 2019
DOI: 10.1137/19S019085

Authors: Theren Williams, Zachary Smith, and Drew Seewald (University of Michigan, Dearborn)
Sponsors: Dr. Keshav P. Pokhrel and Dr. Taysseer Sharaf (University of Michigan, Dearborn)

Abstract: With cancer as a leading cause of death in the United States, the study of its related data is imperative due to the potential patient benefits. This paper examines the Surveillance, Epidemiology, and End Results program (SEER) research data of reported cancer diagnoses from 1973-2014 for the incidence of leukemia in young (0-19 years) patients in the United States. The aim is to identify variables, such as prior cancers and treatment, with a unique impact on survival time and five-year survival probabilities using visualizations and different machine learning techniques. This goal culminated in building multiple models to predict the patient's hazard. The two most insightful models constructed were both neural networks. One network used discrete survival time as a covariate to predict one conditional hazard per patient. The prediction rate is nearly 95% for testing datasets. The other network built hazards for discrete time intervals without survival time as a covariate and predicted with lower accuracy, but captured variable effects from initial testing better.

Do two parties represent the US? Clustering analysis of US public ideology survey

Published electronically July 9, 2019
DOI: 10.1137/17S016518

Authors: Louisa Lee and Siyu Zhang (Northwestern University)
Sponsor: Vicky Chuqiao Yang (Northwestern University)

Abstract: Recent surveys have shown that an increasing portion of the US public believes the two major US parties do not adequately represent US public opinion and thinks additional parties are needed [1]. However, there are high barriers for third parties in political elections. In this paper, we aim to address two questions: “How well do the two major US parties represent the public’s ideology?” and “Does a more-than-two-party system better represent the ideology of the public?”. To address these questions, we utilize the American National Election Studies Time Series dataset [2]. We perform unsupervised clustering with the Gaussian Mixture Model method on this dataset. When the data are clustered into two clusters, we find a large centrist cluster and a small right-wing cluster. The Democratic Party’s position (estimated using the mean position of the individuals who self-identify with each party) is similar to that of the centrist cluster, and the Republican Party’s position is between the two clusters. We investigate whether more than two parties represent the population better by comparing the Akaike Information Criterion for clustering results with various numbers of clusters. We find that additional clusters give a better representation of the data, even after penalizing for the additional parameters. This suggests a multiparty system represents the ideology of the public better.

Fast implementation of mixed RT0 finite elements in MATLAB

Published electronically July 18, 2019
DOI: 10.1137/18S017430

Author: Theodore Weinberg (University of Maryland, Baltimore County)
Sponsor: Bedrich Sousedik (University of Maryland, Baltimore County)

Abstract: We develop a fast implementation of the mixed finite element method for Darcy's problem discretized by lowest-order Raviart-Thomas finite elements using MATLAB. The implementation is based on the so-called vectorized approach applied to the computation of the finite element matrices and assembly of the global finite element matrix. The code supports both 2D and 3D domains, and the finite elements can be triangular, rectangular, tetrahedral or hexahedral. The code can also be easily modified to import user-provided meshes. We comment on our freely available code and present a performance comparison with the standard approach.

An Adaptive, Highly Accurate and Efficient, Parker-Sochacki Algorithm for Numerical Solutions to Initial Value Ordinary Differential Equation Systems

Published electronically August 14, 2019
DOI: 10.1137/19S019115

Authors: Jenna Guenther and Morgan Wolf (James Madison University)
Sponsor: Dr. Paul Warne (James Madison University)

Abstract: The Parker-Sochacki Method (PSM) allows the numerical approximation of solutions to a polynomial initial value ordinary differential equation or system (IVODE) using an algebraic power series method. PSM is equivalent to a modified Picard iteration and provides an efficient, recursive computation of the coefficients of the Taylor polynomial at each step. To date, PSM has largely concentrated on fixed step methods. We develop and test an adaptive stepping scheme that, for many IVODEs, enhances the accuracy and efficiency of PSM. PSM Adaptive (PSMA) is compared to its fixed step counterpart and to standard Runge-Kutta (RK) foundation algorithms using three example IVODEs. In comparison, PSMA is shown to be competitive, often outperforming these methods in terms of accuracy, number of steps, and execution time. A library of functions is also presented that allows access to PSM techniques for many non-polynomial IVODEs without having to first rewrite these in the necessary polynomial form, making PSM a more practical tool.

Utilization of Machine Learning to Simulate the Implementation of Instant Runoff Voting

Published electronically August 22, 2019
DOI: 10.1137/18S016709

Author: Nicholas Joyner (East Tennessee State University)
Sponsor: Michele Joyner (East Tennessee State University)

Abstract: In election years when the popular vote winner and Electoral College winner differ, such as in the 2016 presidential election, there tends to be an increase in discussions about alternative voting strategies. Ranked choice voting is a strategy that has been discussed and is currently used in approximately fourteen cities across the United States and in six states for special elections and overseas ballots. Ranked choice voting (RCV), sometimes called Instant Runoff Voting (IRV), is a system of voting in which voters are allowed to rank the candidates. If no candidate wins over fifty percent of the vote, the election automatically goes to another round. The candidate with the least support is eliminated and their votes are redistributed to the voters' next choice. This process of elimination and redistribution continues until a candidate receives a majority of the vote. In this paper, we use predictive modeling strategies and simulation to investigate the potential implications of employing ranked choice voting in a presidential election using the 2016 presidential election as a case study.
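
The elimination-and-redistribution procedure described in this abstract can be sketched in a few lines. This is an illustration of the general instant runoff rule only, not the paper's simulation: the function name, the ballot format (a list of candidates, most preferred first), and the alphabetical tie-breaking rule are assumptions made for the example.

```python
def instant_runoff(ballots):
    """Return the instant runoff winner for a non-empty list of ranked ballots.

    Each ballot is a list of candidate names, most preferred first.
    Assumption: ties for last place are broken alphabetically, which is an
    arbitrary choice made for this sketch.
    """
    remaining = sorted({candidate for ballot in ballots for candidate in ballot})
    while True:
        # Count each ballot for its highest-ranked candidate still in the race.
        tallies = {candidate: 0 for candidate in remaining}
        for ballot in ballots:
            top = next((c for c in ballot if c in tallies), None)
            if top is not None:  # ballots with no remaining choice are exhausted
                tallies[top] += 1
        active_votes = sum(tallies.values())
        leader = max(remaining, key=lambda c: tallies[c])
        # A candidate wins with more than half of the still-active ballots.
        if 2 * tallies[leader] > active_votes or len(remaining) == 1:
            return leader
        # Otherwise eliminate the candidate with the least support and recount.
        loser = min(remaining, key=lambda c: tallies[c])
        remaining.remove(loser)


# Hypothetical ballots: A leads the first round without a majority.
ballots = [["A", "B", "C"]] * 4 + [["B", "C", "A"]] * 3 + [["C", "B", "A"]] * 2
print(instant_runoff(ballots))
```

In this example no candidate has a first-round majority, so the candidate with the fewest first choices is eliminated and those ballots transfer to their next listed choice before the recount.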

Parameter and Uncertainty Estimation for a Model of Atmospheric CO2 Observations

Published electronically September 3, 2019
DOI: 10.1137/18S017533

Authors: Aimee Maurais and Arianna Krinos (Virginia Tech)
Sponsor: Matthias Chung (Virginia Tech)

Abstract: In this project, we deduce and analyze a mathematical model for atmospheric carbon dioxide concentrations during the time period from 1958 to 2018, as observed by NOAA at the Mauna Loa Observatory. We approximate atmospheric CO2 during this period using a linear combination of a constant to represent atmospheric carbon dioxide concentration at the beginning of the modeled period, a sinusoidal function to capture annual seasonal variation in carbon dioxide concentrations, and an exponential component to capture the observed increase in global carbon dioxide concentration in the atmosphere from the Mauna Loa dataset. Using Bayesian inference methods, we estimate parameters for our model via a Markov Chain Monte Carlo method, the Adaptive Metropolis algorithm. We present distributions for each of six important model parameters, and present predictive intervals for projected increases in atmospheric CO2 concentration for the period from 2018 to 2120. We find that CO2 concentrations can be predicted reasonably well using our modeling approach, and suggest that our framework be used as an adaptable, extensible method of finding good approximations with low variances for data of this type.
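
One way to write down a model of the kind described here (a constant, a seasonal sinusoid, and an exponential growth term) is sketched below. The exact functional form, the parameter names, and the choice of six parameters are assumptions for illustration; the abstract does not specify the authors' parameterization.

```python
import math


def co2_model(t, baseline, amplitude, phase, growth_scale, growth_rate, period=1.0):
    """Hypothetical CO2 concentration (ppm) at time t, in years since the start.

    Assumed form (not taken from the paper):
        baseline + amplitude * sin(2*pi*t/period + phase)
                 + growth_scale * exp(growth_rate * t)
    """
    seasonal = amplitude * math.sin(2.0 * math.pi * t / period + phase)
    growth = growth_scale * math.exp(growth_rate * t)
    return baseline + seasonal + growth


# Hypothetical parameter values, chosen only to exercise the function.
print(co2_model(0.0, 315.0, 3.0, 0.0, 1.0, 0.01))
```

In a Bayesian fit, a function like this would serve as the forward model whose parameters are sampled, for example by an adaptive Metropolis scheme as the abstract describes.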

Analysis of an Antimicrobial Resistance Transmission Model

Published electronically September 10, 2019
DOI: 10.1137/19S1254805

Author: John Sangyeob Kim (Pomona College)
Sponsor: Adolfo J. Rumbos (Pomona College)

Abstract: We present an analysis of a system of differential equations that models the transmission dynamics of pathogens with antimicrobial resistance (AMR) in an intensive care unit (ICU) studied by Austin and Anderson (1999). In Austin and Anderson's four-dimensional compartmental model, patients and health care workers are viewed as hosts and vectors of the pathogens, respectively, and subdivided into uncolonized and colonized populations. In the analysis, we reduce the model to a two-dimensional non-autonomous system. Noting that the reduced system has an autonomous limiting system, we then apply the theory of asymptotically autonomous differential equation systems in the plane developed by Markus (1956) and extended by Thieme (1992, 1994), and later by Castillo-Chavez and Thieme (1995).

We first present a stability analysis of the limiting system and prove the existence of a locally asymptotically stable equilibrium point under a set of constraints expressed in terms of reproductive numbers. We then proceed to an asymptotic analysis of the non-autonomous, two-dimensional system by applying a Poincaré-Bendixson type trichotomy result proved by Thieme (1992, 1994). In particular, we establish that any forward bounded trajectory of the non-autonomous system that starts within a defined rectangular region will converge toward the equilibrium point of the limiting system, provided that certain conditions given in terms of the reproductive numbers are satisfied.

Solving the Dirac Equation with the Unified Transform Method

Published electronically September 13, 2019
DOI: 10.1137/19S1257925

Author: Casey Garner (Rose-Hulman Institute of Technology)
Sponsor: William Green (Rose-Hulman Institute of Technology)

Abstract: In this article we use the Unified Transform Method to study boundary value problems for a hyperbolic system of partial differential equations from relativistic quantum mechanics. Specifically, we derive solutions to the Dirac equation in both the massive and massless cases on the half-line and the finite interval using this method.

A Comparison of Machine Learning Approaches to Housing Value Estimation

Published electronically November 18, 2019
DOI: 10.1137/18S017296

Author: Orton Babb (George Mason University)
Sponsor: Igor Griva (George Mason University)

Abstract: Housing value estimation relies on hedonic pricing models whereby price is determined by both internal characteristics (bedrooms, bathrooms, living area, etc.) and external characteristics (neighboring houses, ZIP code, etc.). While classical parametric models based on linear regression analysis have been well studied in this application, the theory of hedonic prices places no restrictions on the hedonic price functional form, and hence, more recent research has attempted to apply machine learning (ML) approaches such as K-Nearest Neighbors and Support Vector Machine Regression (SVR). Many of these ML methods are employed on the basis of their flexibility in terms of making fewer assumptions on the shape or distribution of the data. ML models are therefore used with the expectation of higher accuracy in predicting the final sale price of a house. In this study, we consider the combination of various pre-processing procedures and candidate models on a historical data set of house sales in King County, Washington. Different measures of accuracy are considered in interpreting model performance. The results suggest that while machine learning algorithms like SVR achieve top performance as measured by the adjusted $R^2$, classical parametric models can also achieve out-of-sample generalization nearing that of the more sophisticated ML models, with faster training times, no need for feature scaling and more easily interpreted parameters.

Using Mathematical Models to Rank the Members of Criminal Networks

Published electronically December 5, 2019
DOI: 10.1137/17S016592

Author: Lucas Chirino (High Point University)
Sponsor: Laurie Zack (High Point University)

Abstract: Different mathematical approaches to ranking were used to determine the level of importance of every member in a criminal network. Two different data sets consisting of phone records provided by the FBI were analyzed. The first data set consisted of call logs over a three-year span of members in a drug ring. The second data set consisted of the call logs of members in a gang. After analyzing the results of the rankings, properties of the two networks are discussed to provide further insight into why certain ranking algorithms performed well and others did not.