CS Night 2017

Validating and Restructuring Voter Data for Effective Marketing

Presenters: Chaitanya Mattey, Melanie Baybay, Neha Bandal

Sponsors: Jude Barry and Christopher Weiss, VoterPros

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

VoterPros is a web-based service that provides candidates with direct-mail marketing tools for elections at every level of government, from local school boards to statewide offices. Unfortunately, voter data is messy and dynamic, as district boundaries often change. Further, retrieving such data from the data provider is very slow. We introduce two solutions: dynamic outlier detection of voter data, and a cache of previous user requests, made possible by restructuring the data. With these additions, the VoterPros web service is more robust and user-friendly.


Activity Analyzer By Using Sensor Data

Presenters: Anjani Bajaj, Bhargavi Kommineni, Surada Lerkpatomsak

Sponsor: Jim Huang

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

In this project, we built an end-to-end system that collects raw accelerometer data from any Android-based smartphone and applies the best-performing predictive classifier to infer whether a person is walking or running. The data collection system can scale to automate data aggregation and processing for any number of data contributors to the modeling pipeline. The native Android client lets the user generate labeled (walking/running) accelerometer time-series data that is uploaded directly to our data collection backend. We visualized the collected data and generated features from it, which we used to construct and evaluate several classifiers. This end-to-end system makes it easier to extend the data analysis in the future.


Design and Development of a New Modality for Unified Scalable Processing in SnapLogic

Presenters: Chengcheng Wang, Yiding Liu, Dingyi Chen

Sponsor: Greg Benson, SnapLogic

Faculty Advisors: CS 690 Master's Project, Prof. Greg Benson, Prof. Olga Karpenko

To date, SnapLogic approaches to data-intensive processing have introduced modality in the user-visible programming model. The goal of this project is to develop a proof-of-concept implementation showing how the core of Apache Flink, combined with the SnapLogic Standard Mode data model and expressions, can efficiently support application integration, data-intensive processing, and streaming computation in a unified manner. We implemented a SnapLogic pipeline both as a native Flink program and as programs that integrate Flink with SnapLogic code, and compared their performance. Our results demonstrate that it is feasible to use Apache Flink while still supporting the Standard Mode data model and expressions of SnapLogic. In this way, we can benefit from Flink’s novel de/serialization, memory management, scalable batch computation, and scalable streaming computation.


PagerDuty Incident Dashboard

Presenters: Alec Hsu, Liang Wang, Ethan Wilcox

Sponsor: Richard Just, Twitter

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

The PagerDuty Incident Dashboard is a web-based visualization tool integrated with PagerDuty, an incident aggregation and dispatch service. PagerDuty aggregates logs from monitoring tools into actionable incidents, and alerts on-call engineers based on specified management policies. The dashboard serves as a tool to aggregate and filter PagerDuty incident data and display temporal trends. It focuses on long-term trends based on both organization categories and temporal groupings. A user can choose various teams, services, and escalation policies to group together in a custom organization, and contrast and compare incident counts and averages for those chosen metrics.


Jenkins Cluster Management

Presenters: Gauri Joshi, Priyam J. Patel, Rushabh Shah, Siwadon Saosoong

Sponsor: Bernhard Gass, Twitter

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

Jenkins Cluster Management works with tools such as Rundeck, Jenkins, Docker, Tomcat, and RESTful web services. It automates tasks in a company environment, reducing the manual effort they require. Examples include running Jenkins jobs via Rundeck and moving jobs and slaves from one Jenkins master to another.


Better Cave Surveying

Presenters: Mathieu Clément, Rohith Madhavan

Sponsor: John Billings

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

In this project, we built a relatively inexpensive 3D imaging system using components such as a LiDAR (Light Detection and Ranging) sensor, servomotors, a rotary encoder, and a microcontroller. We then assembled the data gathered in the field (e.g., from caves) into a point cloud and used it to create a mesh that can be visualized in a viewer application that we developed.



Presenters: Aishwarya Chandrashekhar, Bindu Balasubramanian

Sponsor: Jon Rahoi

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

Lunar and solar calendars are widely available, but there is no provision to view them in a unified way, which makes it difficult to understand how the dates and holidays from each calendar relate to each other. Also, in today’s world, it is crucial to have a global mindset. To ensure team collaboration and organizational effectiveness globally, it is important for employees to enhance their awareness of global workplace cultures. The website we developed for this project provides a unified way to visualize all the lunar and solar holidays on a circular Gregorian calendar and allows users to go forward and backward through the years.


Investigating Full-Body Embodiment in Virtual Reality with Physiological Signals

Presenters: Yi Yang, Bingkun Yang

Sponsor: Prof. Beste Yuksel

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

We investigate the effects of full-body enabled tracking in virtual reality on users by analyzing their physiological signals using electrodermal activity and electrocardiography. Our preliminary results suggest that users may be responding differently physiologically to different conditions in virtual reality.


Property Management Website

Presenters: Fu Tan, Omer Akin

Sponsor: Jose Alvarado

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

Property Management Website is an all-in-one platform that serves both property managers/owners and renters. On our website, property managers can create and list their properties for rent, receive and review renter applications, charge renters different kinds of fees, receive maintenance requests from renters, and visualize revenues and expenses. Renters can search and apply for properties, send maintenance requests to property managers, and view a list of payment requests and pay them through the website.


Influential: An Influencer Marketing Platform

Presenters: Jiali Ding, Zhenchao Zhang, Tuo He, Jinjian Guo

Sponsor: Jose Alvarado

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

Influential is an online platform for brand marketers to manage campaigns, cooperate with regional directors, build connections with social media influencers as well as consumers, and automate payments. It supports multiple roles and provides a full set of features for marketers to implement influencer marketing, including a real-time messaging service. It also offers consumers a channel to voice their opinions by allowing them to comment on reviews written by influencers. Influential currently focuses on restaurants, but can be easily extended to other domains.


Trip Sharing App

Presenters: Xue Kang, LingHsin Hsu, Ruiling Yuan

Sponsor: Scott Zhu

Faculty Advisor: CS 690 Master's Project, Prof. Olga Karpenko

Many people who commute through the city and travel around the world have similar itineraries. There are many benefits of sharing travel plans: it allows users to find companions, make use of carpooling, split travel costs, or share space in the car trunk. We built a Trip Sharing iOS application which helps users make the most of the shared itinerary information. With our trip-sharing app, users are able to create trips, search for trips, ask to join a trip or get feedback from a trip owner.


Appointment Scheduling Website

Presenters: Brent Rucker, Sherry Feng, Karen Diaz Paucar, Marbo Cheng Ye, Catherine Yu

Sponsor: Jose Alvarado

Faculty Advisor: Prof. Beste Yuksel, CS490

Independent contractors find it difficult to promote their services (such as photography, modeling, or plumbing) and availability to the public. It is also challenging for the general public to find short-term services. Both groups turn to social media channels, job search engines, or Craigslist to find short-term services or promote their own. Even then, matching availability, finding the needed service, and tracking payment can be quite a challenge. Our appointment scheduling website allows independent contractors to make their services and availability visible to the public, and enables people to book an independent contractor.


Rotation Scheduler

Presenters: Jeremy Kerby, Arseniy Novitskiy, Code Cole, Amanda Fimbres

Sponsors: Richard Just and Remy DeCausemaker, Twitter

Faculty Advisor: Prof. Beste Yuksel, CS490

Rotation Scheduler is a scheduling tool that will intelligently determine who will be on duty for a task. To help with this, we use a third-party service, PagerDuty, and its API. Rotation Scheduler is a Python application that uses a MySQL database, provides a user interface, and communicates with the third-party API. Furthermore, we have laid the groundwork to implement a genetic algorithm in the future. The application offers a variety of features and allows for customizable user settings.


Food Waste Reporter

Presenters: Mohamed Elafifi, Mitchell McPartland, Max Sciarra

Sponsors: Mick Washo and Richard Hsu

Faculty Advisor: Prof. Beste Yuksel, CS490

Food Waste Reporter is a web application created for the USF branch of the Food Recovery Network (FRN). The Food Recovery Network at USF recovers leftover food from the USF campus and donates it to various shelters in the Bay Area. Written in JavaScript and using React, Node, and Express, this application will be used to collect, store, and process "food waste" data reported by users. The administrators of the USF branch will use our application to view and edit all of the "food waste" data stored in our database. The purpose of this project is to provide a platform through which users can report food waste to the USF FRN directly and seamlessly.


Pants Rebuild Refactor

Presenters: David Katz, Denali Marsh, Edward Ra, Michael Tran

Sponsor: Yi Cheng, Twitter

Faculty Advisor: Prof. Beste Yuksel, CS490

Our team is contributing to Pants, an open source project whose main contributor is Twitter. Pants, a build system designed to set up large codebases for engineers, impacts not only Twitter but also Medium, Foursquare, Square, and others. With the help of our sponsors and mentors, Yi Cheng, Nick Howard, Ity Kaul, and Daniel Wagner-Hall, we are adding new features to the tool. This happens through public code reviews on GitHub, where our sponsors (and anyone else) have the opportunity to suggest improvements. If our work looks good, a mentor merges our code, and we become official contributors to Twitter! As for the features themselves, interaction with Pants happens entirely through the command line, so we have gradually added new command-line options. So far, we have worked mostly in Python and Go, but also in other languages such as Java and Scala, since Pants supports a multitude of formats. Overall, learning about the Pants codebase and the open source process has proved highly rewarding and invigorating!


Open Source Metrics

Presenters: Casey Haber, Gordon Li, Nyssa Chennubhotla, Miguel Arreguin

Sponsor: Remy DeCausemaker

Faculty Advisor: Prof. Beste Yuksel, CS490

The Twitter Open Source Team is focused on creating a simple yet powerful tool for assessing the health of open source projects. We are creating a better system for driving community involvement and increasing overall code health. Metrics dashboards exist, but they can be heavy and cluttered, and they rarely provide the quick facts and data needed to drive towards healthier software projects. We are building a lightweight, cost-free, visualization-based report system that leverages the GitHub API and can be deployed on GitHub Pages as a static JavaScript bundle. Our report is broken up into a four-part narrative: Discovery, Usage, Retention, and Activity. Based on these metric categories and the real-time comparison of software projects, our report will inform and help drive a better understanding of your open source project.


Designing A Difference

Presenters: Malachy Lin-Nugent, Jay Ng, Sonyu Liu, Hongjiang Qiang

Sponsor: Calvin Liang, Designing a Difference

Faculty Advisor: Prof. Beste Yuksel, CS490

The project will help automate Designing a Difference's supply chain service, which links local retailers and fashion designers with apparel manufacturers. Currently, communication and order requests between the two parties are handled manually (pen and paper). With the e-commerce website that we built, the process will be much faster, and Designing a Difference will be able to scale its services to handle a greater number and magnitude of orders.


Investigating Affect in Learning Through Posture and Gesture Detection

Presenter: Shengcai Cheng

Faculty Advisor: Prof. Beste Yuksel, Directed Study

We investigate the detection of engagement, boredom, and frustration during learning through posture and gesture detection using computer vision techniques. We use a Microsoft Kinect to measure if the learner is leaning to the left or the right, near or far, or moving. We also detect whether they are raising their hand to their face. We compare our predictions to learners’ subjective self-reports of their affective states.


UX Design Website Prototype "Bakeology"

Presenters: Lorina Dzhamankulova, Prescott Carlson, Eve Jonas

Faculty Advisor: Prof. Jeff Johnson, CS486

Our team is working on the website “Bakeology”, a social network for baking enthusiasts that allows them to discover new ideas and recipes with fellow bakers. The goal is to connect people of all ages who share an interest in and passion for baking. Initially, we targeted an older female audience, but later tests showed that our target audience is much broader and includes people of different backgrounds who should be interested in using it. As a result, we now know that we need to build a website that is engaging and user-friendly for everyone. To achieve this, we have created a first prototype of our product, which will be improved based on the results and feedback from user tests that we will perform with participants of different ages and backgrounds.


Files with Friends

Presenters: Aishwarya Chandrashekhar, Mathieu Clément

Faculty Advisor: Prof. Jeff Johnson, CS686

In the digital era, file storage and file sharing are a major use case for most cloud services. Our app aims to allow friends to share files (documents, music, images) with each other and retrieve them later. The “friends” use instant messaging to let others know about new files. Alternatively, users will also be notified when a file is shared with them.


DMV Appointments Redesign

Presenters: Neha Bandal, Rohith Madhavan

Faculty Advisor: Prof. Jeff Johnson, CS686

The existing process for scheduling appointments in the DMV app is cumbersome; our goal is to streamline the process and make it more user-friendly. In the new prototype, we propose a step-by-step process for scheduling an appointment that is easy to follow and more interactive.


Simple Raspberry Pi Clusters

Presenters: Derek Dang, Alec Taggart

Faculty Advisor: Prof. Greg Benson, CS 398 Raspberry Pi Cluster Design

A cluster is a group of computers connected together via a network. Clusters are used to run parallel and distributed programs, provide distributed services, and are the basis for cloud computing. We have developed cluster software to make it extremely easy to configure and run a cluster of Raspberry Pi computers. Our software provides a platform to explore parallel and distributed concepts and implementations, especially in the context of a class. We believe that with our software and low-cost Raspberry Pi Zero computers, it is feasible for every student to have their own cluster for projects and coursework. Our implementation produces two modified images, a server and a client, which can be generated with the official Raspbian pi-gen tool and used to easily boot a fully configured cluster. For more information, visit: http://rpicluster.cs.usfca.edu/


The Design and Implementation of Containers for xv6

Presenters: Marcus Chong and David Katz

Faculty Advisor: Greg Benson (CS 326 Operating Systems)

The xv6 operating system, developed at MIT, is a reimplementation of Unix Version 6 for x86 processors and the PC architecture; we use it in the CS 326 Operating Systems class at USF. Containerization allows operating systems to fully isolate an execution environment in a way that is lighter weight than full virtualization. Containers can fully isolate the process namespace, the file system, and physical resources such as CPU time and memory. For service deployment, containerization greatly benefits engineering organizations that create multiple products on the same host. On the individual scale, containerization helps protect your files and data from malicious or buggy behavior while also simplifying the packaging of software dependencies. While containers were pioneered primarily at Google, we now see their widespread use through Docker. Our xv6c project extends xv6 to support containerization. We have file system and process isolation, as well as fair scheduling and some additional features.


Predicting Phenotype from Genotype with Machine Learning

Presenter: Rob Reeves

Faculty Advisor: Prof. Patricia Francis-Lyon, Bioinformatics

Genomic variants such as Single Nucleotide Polymorphisms (SNPs) are known to be a major factor influencing many physical traits, diseases, and other phenotypes. With the rise of economical DNA sequencing/genotyping services such as 23andMe, publicly available genomic data is growing exponentially. This presents an opportunity to use genomic data for health risk assessments and predictive analytics.

This project applies supervised machine learning, without domain knowledge, to publicly available genomic data to predict a phenotype from SNP values alone and to identify the SNPs and SNP interactions that are important to the disease or trait. The code base was structured and engineered according to best practices for ease of use by citizen scientists, who can apply it to the prediction of a variety of diseases or traits. As a proof of concept, this project predicted eye color with 89% accuracy and succeeded in identifying, from ~1 million SNPs, those that are most influential in eye color prediction. All genes known to be influential in eye color were detected, along with a known polygenic interaction. The next step is to apply this technique to identify novel SNPs and their interactions in predicting a phenotype such as schizophrenia.


Simulating a large dataset of ECG readings for MS training site

Presenters: Anjani Bajaj, Rushabh Shah, Max Alfaro

Sponsor: Dr. Robert Horton, Microsoft

Faculty Advisor: Prof. Patricia Francis-Lyon, Bioinformatics

Our project is to simulate electrocardiography (ECG) data for use in health care analytics demonstrations and exercises. We started by writing a Shiny app to scan through an ECG dataset to find good examples of ECG waveforms, then wrote a second Shiny app to fit a parameterized curve to a selected waveform. By fitting the simulated waveform to the peak locations generated by a rhythm simulator developed in an earlier student project, we will be able to generate simulated ECG data that contains statistical signals in both heart rate and heart rate variability. We have also developed functions to compactly encode this data in JSON format for transmission to the cloud.


Uncovering Bias in Deep Learning Model for Disease Prediction

Presenters: Melanie Baybay, Chaitanya Mattey, Alec Hsu

Sponsor: Tom Brander, Influence Health Inc.

Faculty Advisor: Prof. Patricia Francis-Lyon, Bioinformatics

We aim to improve Influence Health’s Deep Learning Model for Disease Prediction. We analyze the output of this model, searching for insights and patterns to inform adjustments and improvements in future iterations of the model. These improvements are guided by analyzing the demographics and disease history of various patients. We observe cases where the model performs extremely well and cases where certain diseases are confused with each other, and use this information to improve the next iteration of the model.


Investigation of TCGA in search of therapy target for triple negative breast cancer

Presenters: Luika Timmerman, Rashmi Manjunath

Sponsor: UCSF Helen Diller Family Comprehensive Cancer Center

Faculty Advisor: Prof. Patricia Francis-Lyon, Bioinformatics

There is a subtype of breast cancer that is associated with aggressive progression, high levels of recurrence, and the affliction of younger women. These cancers are known as triple-negative/basal-like breast cancers (TNBC), as they test negative for expression of three proteins (estrogen, progesterone, and HER-2 receptors), two of which are the targets of therapeutics in common use in the breast cancer clinic. A specific therapeutic target has not yet been found to treat TNBC, and prognosis remains poor for women with these tumors. In an effort to identify such a target, the Timmerman lab at the UCSF Helen Diller Cancer Center is working to gain an understanding of the mechanisms and genomic relationships involved in unregulated cellular proliferation in TNBC. Potential targets have been identified for the development of drugs that target tumor metabolism.

As part of that effort, we conducted bioinformatics investigations using genomic and proteomic data to gain insights that might be actionable in the treatment of breast cancer. The TCGA breast cancer dataset was explored for patterns of genomic alterations and expression that are associated with breast cancer subtypes, noting particularly how these differ in TNBC as opposed to other breast cancer subtypes. A principal components analysis (PCA) has been conducted of expression levels of PAM50 genes in the cancers of 1208 patients, as well as a PCA of the differential expression of these genes in the tumor vs. normal breast tissues of patients. The potential targets identified by the Timmerman lab have also been explored. Additionally, clustering techniques have been employed to infer missing subtypes so as to augment the basal-like category in the TCGA breast cancer dataset for future analysis.