CS Night 2016
The following are some of the student projects presented at USF CS Night 2016. They came from the Master's and Senior Project courses, the Bioinformatics course, and several faculty-sponsored projects.
Using Animation to Alleviate Overdraw in Multi-class Scatterplot Matrices
Presenter: Helen Chen (hhchen@dons.usfca.edu)
Faculty Advisors: Profs. Sophie Engle and Alark Joshi
Abstract:
Scatterplots are a widely-used technique for visualizing multivariate datasets. Even though scatterplots play an important role in data visualization,
they have known issues with overdraw. Overdraw occurs when points or glyphs are drawn on top of each other and obscure the underlying data.
Overdraw affects the ability of viewers to correctly understand the data distribution and discern relationships among subgroups of the data.
There are a variety of techniques for alleviating overdraw, none of which involve animation. Our research aims to use animation to visualize
multidimensional data in multi-class scatterplot matrices and compare its efficacy in alleviating overdraw against that of other techniques.
Student Record Verification App - A Decentralized Application
Presenters: Mayank Thirani, Ryan Zhu, Jakob Tarnow
Sponsor: Jim Huang
Faculty Advisors: CS 690 Master's Project, Prof. Olga Karpenko
Abstract:
The Student Record Verification App uses Blockchain to solve a trust problem: recruiters in human resources departments around the world
need to verify the authenticity of applicants' educational degrees without relying on an intermediary third party to perform that
verification. Each transaction in the Blockchain is verified by consensus of a majority of the participants in the network. The Blockchain contains
all verifiable student records and past transactions. Allowing a recruiter to validate, via the Blockchain, that a student's claimed educational background
matches the record held by the respective registrar removes the need for a central entity and leads to faster attestation of student records.
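The tamper-evidence property that this abstract relies on can be illustrated with a simple hash chain. The following is a minimal sketch, not the team's implementation: the record fields and function names are hypothetical, and network consensus among participants is left out entirely.

```python
import hashlib
import json


def block_hash(record, prev_hash):
    """Hash a record together with the hash of the previous block."""
    payload = json.dumps({"record": record, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records so that each block commits to everything before it."""
    chain, prev = [], "0" * 64  # an all-zero hash marks the start of the chain
    for record in records:
        h = block_hash(record, prev)
        chain.append({"record": record, "prev": prev, "hash": h})
        prev = h
    return chain


def chain_is_valid(chain):
    """Recompute every hash; any edited record breaks the chain."""
    prev = "0" * 64
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["record"], prev):
            return False
        prev = block["hash"]
    return True


def verify_claim(chain, claim):
    """A recruiter's check: is the claimed record present in a valid chain?"""
    return chain_is_valid(chain) and any(b["record"] == claim for b in chain)


# Hypothetical records, for illustration only.
records = [
    {"student": "A. Student", "degree": "BS Computer Science", "year": 2016},
    {"student": "B. Student", "degree": "MS Computer Science", "year": 2016},
]
chain = build_chain(records)
```

Here `verify_claim(chain, records[0])` succeeds, while a claim with an altered degree, or a chain in which any stored record has been edited, fails the check.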
Ten-X Hackday Tool
Presenters: Jeremiad Raymond, Teng Hu, Yi Xiao
Sponsor: Jon Rahoi
Faculty Advisor: CS490 Senior Project, Prof. Jeffrey Johnson
Abstract:
Ten-X Hackday Tool is a web application that serves as a platform for our sponsor company (Ten-X) to run an annual programming competition
called “Hackday”. This platform will be used to collect, store, and process data entered by participants and graders. The purpose of this project is to
provide a better platform for Hackday organizers to control the flow of Hackday by handling the most repetitive tasks.
Sudokil
Presenter: Dominic Mortlock
Sponsor: The intended clients are students interested in learning more about Unix and scripting concepts. The game also aims to be a fun challenge in which people can practice both their Unix knowledge and their problem-solving skills.
Faculty Advisor: Prof. David Wolber
Abstract:
Sudokil is a hacking/scripting themed puzzle game about using Unix-like commands on a terminal to control computers, robots, and various other devices.
Progress through levels and get access to different puzzle elements while collecting more scripts, permissions and tools.
Customer Ticket Classification Engine - Applying Machine Learning Algorithms to SnapLogic Metadata
Presenters: Min Chen, Shiyi Tan
Sponsor: Prof. Gregory Benson
Faculty Advisor: CS690 Master's Project, Prof. Karpenko
Abstract:
The SnapLogic customer service team needs to prioritize customer tickets and measure customer satisfaction, which was previously done manually and
was very time-consuming. To automate this process, we built two engines, one for prioritizing tickets and one for the sentiment analysis of customer
comments. We first analyzed the ticket data and fit the two models, then used the models to predict the ticket priority and the sentiment of the comment
(“neutral” vs. “negative”). If the ticket is labeled as “high priority” or contains a negative comment, our system sends an alert to the customer support team.
This allows the team to handle interactions with customers more wisely and saves them time.
Snap Recommendation Engine
Presenter: Thanawut Ananpiriyakul
Sponsor: Prof. Gregory Benson
Faculty Advisor: CS690 Master's Project, Prof. Karpenko
Abstract:
SnapLogic has been providing data integration services for years. A snap is a pre-built component that performs an operation on data. A pipeline is a
directed acyclic graph (DAG) of snaps that executes a specific task. To build a pipeline successfully, the user needs to select the right snap and connect it correctly
to the previous snap. For this project, we built an engine that recommends the most likely next snaps to users. We achieved an 88% hit rate in the final prototype,
implemented in Python. This means that 88% of the time, “deciding on the type of snap + searching for it among 100 types of snaps + dragging and dropping it to
the canvas” will be reduced to “1 click.”
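To make the hit-rate metric concrete, here is a minimal sketch of a next-snap recommender and its evaluation. It is not the team's engine: it assumes pipelines can be treated as linear sequences rather than full DAGs, uses simple follow-counts as the model, and the snap type names are hypothetical.

```python
from collections import Counter, defaultdict


def train(pipelines):
    """Count how often each snap type follows another in past pipelines."""
    follows = defaultdict(Counter)
    for pipeline in pipelines:
        for current, nxt in zip(pipeline, pipeline[1:]):
            follows[current][nxt] += 1
    return follows


def recommend(follows, current, k=3):
    """Return up to k snap types most often seen after `current`."""
    return [snap for snap, _ in follows.get(current, Counter()).most_common(k)]


def hit_rate(follows, pipelines, k=3):
    """Fraction of next-snap choices that appear in the top-k recommendations."""
    hits = total = 0
    for pipeline in pipelines:
        for current, nxt in zip(pipeline, pipeline[1:]):
            total += 1
            hits += nxt in recommend(follows, current, k)
    return hits / total if total else 0.0


# Hypothetical snap types, for illustration only.
training = [
    ["Reader", "Mapper", "Writer"],
    ["Reader", "Filter", "Mapper", "Writer"],
    ["Reader", "Mapper", "Filter", "Writer"],
]
held_out = [
    ["Reader", "Mapper", "Writer"],
    ["Reader", "Sort", "Writer"],
]
follows = train(training)
```

Calling `hit_rate(follows, held_out, k=1)` scores each held-out connection as a hit only when the single top recommendation matches the snap the user actually chose; a larger `k` corresponds to showing the user a longer list of suggestions.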
My Smart Financial Advisor - A Mobile Application for Mutual Fund Investment Management
Presenters: Richard Wang, Chen-Ning Chi, Kaynat Quayyum
Sponsor: Stephen Y. Pak, The Core Group
Faculty Advisor: CS690 Master's Project, Prof. Olga Karpenko
Abstract:
Mutual fund investment currently makes up a vast proportion of the retirement assets for Americans. At the same time, as mobile devices attain increasing
capabilities and popularity, more people switch from PCs to mobile devices such as tablet computers and smartphones. We provide a platform to buy and sell
mutual fund shares on both iOS and Android devices. This enables users to manage mutual fund investments anywhere and anytime. Our application is
implemented in C# using Xamarin, which allows us to build iOS and Android apps from a single shared codebase. Our app provides a good user experience
with a high level of security.
An Exploration of Single Nucleotide Polymorphisms on Type 2 Diabetes Outcome
Presenters: Michael Totagrande, Irina Popova
Sponsors: Sean Kimbro, North Carolina Central University, and La Creis Kidd, University of Louisville
Faculty Advisor: CS640 Bioinformatics, Professor Patricia Francis-Lyon
Abstract:
Type 2 Diabetes (T2D) affects millions and is characterized by the inability to produce enough insulin, resulting in improper glucose regulation. With
numerous direct risk factors, including increased body mass index (BMI), race, high blood pressure, and the presence or absence of certain single
nucleotide polymorphisms (SNPs), T2D is a complicated disease. Herein, we explore over 600,000 SNP frequencies for more than 2000 individuals
in order to determine their impact on T2D outcome.
Using Face Tracking for Computationally-Efficient Visualization of Large Vector Data
Presenter: Thanawut Ananpiriyakul
Faculty Advisor: Prof. Alark Joshi
Abstract:
Visualizing large vector data is computationally expensive. Given that human beings can only attend to a certain region of a screen at a time, we have developed
a novel face tracking-based technique for visualization of large vector data. This focus+context visualization of vector fields reduces visual clutter and helps the
user visualize features of interest. We chose to use streamline and glyph-based methods to represent the vector data. Users can interact with the data in real time,
choosing regions of interest through a mouse, a touch interface, or their face. The presented visualization technique results in a frame rate that is almost 5 times
higher than that of the full-detail visualization of vector data.
Exploring Leap Motion for Intuitive Interaction of Scientific Data
Presenter: Shiyi Tan
Faculty Advisor: Directed Study, Prof. Alark Joshi
Abstract:
We explore the use of the Leap Motion for intuitive interaction with medical data, aiming to help practitioners interact with large, high-resolution datasets.
We use VTK for the visualization pipeline that includes data processing, surface extraction/volume rendering, and basic user interaction. We facilitate
freeform interaction without the use of a mouse and keyboard using the Leap Motion. With the Leap Motion controller, users can explore the 3D data
and perform basic interaction such as rotation, translation, and zooming in.
Computational Enzymology
Presenters: Stephanie Martin, Meriam Vejiga, Adrian Ramirez
Sponsor: Distributed Bio is an antibody discovery, engineering, informatics, and services company focused on producing next-generation antibody libraries and revolutionary vaccines.
Faculty Advisor: CS640 Bioinformatics, Professor Patricia Francis-Lyon
Abstract:
Enzymes are the original organic chemists (OCC), capable of catalyzing a wide variety of reactions that have great therapeutic potential. Many enzymes
have been cataloged and annotated using the Gene Ontology, a gene annotator's reference, and categorized by the Enzyme Commission, a database that
classifies enzymes based on the nature of their enzymatic activity. We took advantage of these two databases to mine for homology groups with similar
enzymatic activity, but different substrates. We characterized these enzyme groups by sequence variability and enzymatic variability. This work provides a
foundation for the creation of a new class of enzyme replacement therapy and for the creation of a new generalized synthesis technology.
Mechanistic Indicators of Childhood Asthma
Presenter: Stephanie Styx
Sponsor: Dr. ClarLynda Williams-DeVane from North Carolina Central University sponsored the mechanistic indicators of childhood asthma project.
Her objective for this project is to identify key environmental exposure contributors to asthma subtypes of varying severity.
Faculty Advisor: CS640 Bioinformatics, Professor Patricia Francis-Lyon
Abstract:
This project aims to understand the relationship between environmental factors and their effect on asthmatic children. Through principal component analysis, we looked at
how much variance there is when asthmatic children are exposed to similar or different environmental factors. The cohort of patients analyzed in
this project consisted of asthmatic African American children from Detroit, Michigan.
Computational Enzymology
Presenters: Stephanie Martin, Meriam Vejiga, Adrian Ramirez
Sponsor: Dr. Jacob Glanville, former Principal Scientist at Pfizer, PhD in Computational and Systems Immunology at Stanford University School of Medicine
and current Chief Science Officer of Distributed Bio.
Faculty Advisor: CS640 Bioinformatics, Professor Patricia Francis-Lyon
Abstract:
Knowing the three-dimensional structure of a protein is essential to understanding how the protein functions. Currently, protein structures are determined
through X-ray crystallography, which can be difficult and laborious for some projects. Dr. Jake Glanville, CSO of Distributed Bio, created a coding package to
predict the structure of B cell and T cell receptors. This code draws on probabilistic alignment and hidden Markov modeling and uses HMMER 3.0 and the
NCBI BLAST toolkit to identify potential templates for homology modeling, then generates a model using UCSF’s Modeller. We tested the ability of the script
pdb-getModels.pl to accurately produce models by using an input, self-models=0, to remove any template with more than 95% identity to the query sequence,
ensuring the program didn’t fetch the known crystal structure of the query for homology modeling. After generating hundreds of models, we used another script,
rmsd-Calculate.py, to calculate the root mean squared deviation (RMSD) of the generated model superimposed on the published structure to validate whether
this package has the potential to accurately predict the variable regions of antibodies.
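For readers unfamiliar with the metric, RMSD over N matched atoms is the square root of the mean squared distance between corresponding atom positions. The sketch below is not the rmsd-Calculate.py script mentioned above; it assumes the two structures are already superimposed and that atoms are listed in matching order, and the coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two matched lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Made-up coordinates: every atom in the model is offset by 1 unit along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
published = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
```

A lower value of `rmsd(model, published)` indicates a model closer to the published structure; identical coordinates give 0.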
Ozone exposure causes differential expression of genes involved in cell growth and DNA binding
Presenters: Chelsea Yee, Amrita Rishi
Sponsor: Dr. Mehrdad Arjomandi
Faculty Advisor: CS640 Bioinformatics, Professor Patricia Francis-Lyon
Abstract:
Ozone, a gas with high oxidation potential, is a major component of air pollution and has been found to damage the respiratory tissues in humans.
To our knowledge, no one has yet published the results of an exacerbation study utilizing ozone as a model for the impact of air pollutants. An ongoing
study by Dr. Mehrdad Arjomandi and associates at UCSF aims to establish the impact of ozone-induced injury and inflammation in asthma and other
lung diseases. Currently, this study aims to determine the differentially expressed genes (DEGs) in subjects, both with and without asthma, that were
exposed to medium (100 ppm) and ambient (200 ppm) levels of ozone. Gene expression levels for 18 subjects were determined by Affymetrix microarray
in an ozone-exacerbation study performed by Dr. Arjomandi’s team at UCSF. In partnership with Dr. Arjomandi and Prof. Francis-Lyon (USF), our team
performed statistical analysis of the Affymetrix microarray data in R using the limma package to identify DEGs in airway epithelial tissues in response to
ozone exposure. A total of 68 DEGs were identified from the Affymetrix microarray data for all 18 patients. Among the 68 DEGs, 4 were more frequently
differentially expressed (adjusted p-value < 0.1): MAPRE3, HKR1, MOB3B and ZFR. Genes MAPRE3 and HKR1 were up-regulated whereas MOB3B and
ZFR were down-regulated. Further studies providing new knowledge of the function and downstream effects of these genes can lead to the possibility of
new gene therapy and pharmacological targets.
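The abstract does not say how the adjusted p-values were computed. limma's `topTable` defaults to the Benjamini-Hochberg adjustment, so the sketch below assumes that method. It is written in Python for illustration rather than R, and the gene labels and p-values are made up; they are not results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is capped
    # by the adjusted values of all larger p-values (keeps them monotone).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values with placeholder gene labels.
raw = {"gene_a": 0.01, "gene_b": 0.04, "gene_c": 0.03, "gene_d": 0.20}
adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))

# Keep genes whose adjusted p-value falls below the 0.1 threshold.
selected = sorted(g for g, p in adjusted.items() if p < 0.1)
```

With these placeholder values, `selected` contains the genes whose adjusted p-value is below 0.1, the same threshold quoted in the abstract.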
Muse Mobile App
Presenter: MD Naseem Ashraf
Faculty Advisor: CS640 Bioinformatics, Professor Patricia Francis-Lyon
Abstract:
An Android app that leverages Muse headbands to record and transmit EEGs from mobile devices easily and quickly.
AI for Princes of California
Presenters: Kyle Baker, Austin Bushree, Cole Howard
Sponsors: Jon Rahoi and Justin Sher, Noo Games
Faculty Advisor: CS490 Senior Project, Prof. Jeffrey Johnson
Abstract:
Princes of California is a strategic board game that is similar to a hybrid of Monopoly and Poker. A single turn consists of playing a tile on the board and
buying up to three shares of any companies that have been built from the tiles on the board. The current built-in opponent makes random moves and is
easily defeated by human players. Our project seeks to use multiple techniques to build a competitive AI opponent for this game.
We will be implementing a heuristic algorithm based on the strategies we have developed while playing the game. We will also be fine-tuning a neural
network using TensorFlow, an open-source machine learning package. The network will be trained by playing against random bots. The gameplay tactics
change based on the number of players, so our AI will be trained separately for 2-, 3-, 4-, 5-, and 6-player games. The ultimate goal of our project is to build an
AI that strategically places tiles and buys shares of companies to create an entertaining opponent for online players.
Fitness App for Vue Smart Glasses
Presenters: Scott Zhu, Ji Lu, Shengcai Cheng
Sponsor: Jason Gui, Vigo Technologies
Faculty Advisor: CS690 Master's Project, Prof. Olga Karpenko
Abstract:
Vue is a wearable device, a pair of “smart” glasses designed for everyday use. Our team developed the companion app for Vue on iOS and Android.
The app provides fitness features such as step tracking, calorie counting, and inactivity alerts that help people lead healthier and more active lives. It
also provides some additional features, such as finding the device using the app and delivering notifications.
Visualization of Hierarchical Time-Series Data Using the Sunburst Technique
Presenters: Joey Estella, Marissa Masangcay, Lyndon Ong Yiu, Mohammad Bazarbay
Sponsors: Profs. Sophie Engle and Alark Joshi
Faculty Advisor: CS490 Senior Project, Prof. Jeffrey Johnson
Abstract:
The Visualizing Time Series Data project addresses the need for new ways to visualize and aggregate large time series data. In its raw form, such data
is often too large to interpret directly. The problem then becomes how to aggregate that data, i.e., which metrics (mean, median, etc.) and levels (days,
hours, etc.) should be used to summarize the data and reveal its patterns and trends.
Our visualization tool attempts to address this problem. It features an interactive dashboard where users can view organized data in a
sunburst visualization that displays the data in a meaningful way, together with a complementary line chart that corresponds to the data in
the sunburst. This gives added context to otherwise ‘normal’ looking data so that the user
can draw meaningful and significant conclusions about the data at hand. This tool also features a non-interactive dashboard where various static sunburst
visualizations are displayed in a grid for easy comparisons. These static visualizations feature multiple metrics (mean, median, etc.) across different levels
(days, hours, etc.).
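The aggregation choice described above, a metric applied at a level, can be sketched in a few lines. This is an illustrative example rather than the team's tool: the function name, the two supported levels and metrics, and the sample readings are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

# Supported metrics and levels; each level maps a timestamp to a bucket key.
METRICS = {"mean": mean, "median": median}
LEVELS = {
    "day": lambda ts: ts.strftime("%Y-%m-%d"),
    "hour": lambda ts: ts.strftime("%Y-%m-%d %H:00"),
}


def aggregate(samples, level, metric):
    """Group (timestamp, value) samples by level and summarize each group."""
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[LEVELS[level](ts)].append(value)
    return {key: METRICS[metric](values) for key, values in sorted(buckets.items())}


# Made-up readings, for illustration only.
samples = [
    (datetime(2016, 5, 1, 9, 0), 10.0),
    (datetime(2016, 5, 1, 9, 30), 20.0),
    (datetime(2016, 5, 1, 10, 15), 60.0),
    (datetime(2016, 5, 2, 9, 5), 40.0),
]
daily_mean = aggregate(samples, "day", "mean")
hourly_median = aggregate(samples, "hour", "median")
```

Each combination of level and metric yields one summarized series, which is the kind of variation a grid of static visualizations could place side by side for comparison.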
Real Estate Recommendation Engine
Presenters: Rob Reeves, Simon Kwong, Zhe Xu
Sponsor: Jon Rahoi, Ten-X
Faculty Advisor: CS690 Master's Project, Prof. Olga Karpenko
Abstract:
Buying or selling a property is not something most of us do every day. But when the time comes, searching for a new home or office is exhausting. It's stressful,
agitating, and searchers often find themselves settling for less. We developed a recommendation engine for a Ten-X real estate marketplace that will assist in
alleviating some of that stress. The engine recommends available properties based on past user activity. It uses a graph-based recommendation algorithm that
combines collaborative and content-based filtering. The goal is to maximize the likelihood a user will interact with the recommended properties embedded in the ad.