Dr. Amit Sheth
Amit P. Sheth
Founding Director, AI Institute@UofSC
NCR Chair and Professor, Computer Science & Engineering

Project History

(Listed in reverse chronological order by start date.)


R21DA044518 Lamy, Sheth (PIs) 06/01/2017 - 05/31/2019
NIH $223,715
eDarkTrends: Monitoring Cryptomarkets to Identify Emerging Trends of Illicit Synthetic Opioids Use:
The overall purposes of the proposed study are to (a) characterize illicit synthetic opioid marketing characteristics and trends, and (b) identify new, emerging illicit synthetic opioid terms (e.g., substance names, product forms) from three "cryptomarkets" located in the Deep Web. The Specific Aims of the study are to: 1) Develop a semi-automated knowledge-based system, eDarkTrends, to collect and process data about illicit synthetic opioids supplied on cryptomarkets; 2a) Describe and monitor US-based supply trends of illicit synthetic opioids on cryptomarkets (e.g., trends in availability of non-pharmaceutical fentanyl analogs, U-47700, MT-45), including types of illicit synthetic opioids, prices, advertised purity, dosage and product forms, quantity supplied, and drug combinations; 2b) Identify new illicit synthetic opioid substances and product forms soon after they appear on cryptomarkets.


R01 HD087132-01 Sheth (PI) 07/01/2016 - 06/30/2019
NIH Prasad, Alter (Co-Investigators) $991,804
SCH: kHealth: Semantic Multisensory Mobile Approach to Personalized Asthma Care:
Asthma affects over 300 million people, claiming over 250,000 lives worldwide annually. Using low-cost sensors and an intelligent mobile application, this research will develop algorithms and techniques to empower patients and doctors by providing personalized, contextually relevant, and actionable information synthesized from a comprehensive understanding of health-relevant data. The proposed research will uncover correlations among extracted sensor features and their relationship to the control and severity of asthma, and will personalize the management of asthma.

W911-NF-16-1-0300 Minnery (PI) 05/06/2016 - 02/15/2017
ARO Sheth, Shalin (Co-PIs) $249,034
Maximizing the Collective Intelligence of a Network Using Novel Measures of Socio-Cognitive Diversity:
We will explore the degree to which it is possible to augment “wisdom of crowd” effects by developing novel, theory-based measures of socio-cognitive diversity and using them to select smaller, smarter sub-crowds. Socio-cognitive diversity refers to differences in individuals’ prior beliefs and information sources, including information acquired through social interaction. We will consider the case of networked crowds in particular, that is, groups of communicating individuals who share information in the process of arriving at a judgment (for example, a group of military intelligence analysts who work together to predict the location of a high-value target). We focus on networks because (a) real-world analysis and decision-making typically involve some degree of collaboration; and (b) communications among members (i.e., who said what to whom) constitute a rich data source from which measures of diversity can potentially be extracted using automated methods.

IIS 1622628 Sheth (PI) 03/01/2016 - 02/28/2017
NSF $30,000
III: Travel Fellowships for Students from U.S. Universities to Attend ISWC 2016:
This National Science Foundation award funds Student Travel Fellowships for US students attending the 15th International Semantic Web Conference (ISWC 2016). The conference, which will be held in Kobe, Japan from October 17 to 21, is the premier international forum for state-of-the-art research on all aspects of the Semantic Web and data on the Web, the next-generation World Wide Web. The fellowships allow students to meet key members of the Semantic Web research community, give them the opportunity to disseminate their work, and provide a venue for them to interact with future national and international scientific collaborators.

R01 MH105384-01A1 Pathak*, Sheth (PIs) 03/01/2016 - 06/30/2019
NIH Prasad (Co-PI) Total: $2,195,362
WSU: $505,602
Modeling Social Behavior for Healthcare Utilization in Depression:
Depression is one of the most common mental disorders in the U.S. and is the leading cause of disability, affecting millions of Americans every year. Successful early identification and treatment of depression can lead to many other positive health and behavioral outcomes across the lifespan. This proposal will apply "big data" techniques and methods to identify combinations of online socio-behavioral factors and neighborhood environmental conditions that can enable detection of depressive behavior in communities, and to study access to and utilization of healthcare services.

*Cornell University.

Sheth (PI) 01/01/2016 - 06/30/2017
Universal Freelancer $60,000
Employee and Job Search Semantic Engine: Phase I:
This project will conduct research and initial prototyping of an Employee and Job Search Semantic Engine, subject to the available resources.


IIP 1542911 Sheth (PI) 09/01/2015 - 08/30/2017
NSF Mackay (Co-Investigator) $200,000 + $24,000 (REU)
PFI: AIR-TT: Market-driven Innovations and Scaling up of Twitris - A System for Collective Social Intelligence:
Twitris is a comprehensive analytical tool that can provide professional users with actionable information for making better decisions from social media data. The proposed effort, carried out along with potential customers, seeks to develop and incorporate new innovations that take Twitris on a path towards commercialization. Specific innovations planned include functional enhancements such as a broad range of location-specific processing, intuitive user-guided and background-knowledge-supported analysis and visualization, and cloud-computing-based scaling to meet the real-time processing needs of large-scale, real-world events.

IIS 1513721 Sheth (PI) 09/01/2015 - 08/31/2018
NSF Shalin, Prasad (Co-PIs) $925,104 + $32,000 (REU)
TWC SBE: Medium: Context-Aware Harassment Detection on Social Media:
The aim of this project is to develop comprehensive and reliable context-aware techniques (using machine learning, text mining, natural language processing, and social network analysis) to glean information about the people involved and their interconnected network of relationships, and to determine and evaluate potential harassment and harassers. An interdisciplinary team of computer scientists, social scientists, urban and public affairs professionals, and educators, together with the participation of college and high school students in the research, will ensure wide impact of the scientific research on support for safe social interactions.

NC-5521 Prasad (PI) 07/01/2015 - 11/06/2016
Milcord LLC/ONR DoD SBIR/STTR Phase I and Option Sheth (Co-PI) $101,351
Medical Information Decision Assistance and Support (MIDAS) - Phase I and Option:
In this SBIR project, Milcord and Kno.e.sis propose to research, design, and develop a Medical Information Decision Assistance and Support knowledge base, with a mobile application front end for medical practitioners to both communicate treatment plans with and receive status updates from their patients. The goal is to seed the knowledge base with medical and patient care concepts using existing ontologies, and to populate instances of treatment plans, disease symptoms, and other information required for assisting practitioners with understanding the efficacy of treatment plans.

EAR 1520870 Parthasarathy*, Sheth (PIs) 07/01/2015 - 07/31/2019
NSF Liu*, Kubatko*, Shalin, Prasad
Total: $2,000,000
WSU: $787,500
Hazards SEES: Social and Physical Sensing Enabled Decision Support for Disaster Management and Response:
In this project the team will design novel, multi-dimensional cross-modal aggregation and inference methods to compensate for the uneven coverage of sensing modalities across an affected region. By assimilating data from social and physical sensors and their integrated modeling and analysis, methodology to predict and help prioritize the temporally and conceptually extended consequences of damage to people, civil infrastructure (transportation, power, waterways) and their components (e.g. bridges, traffic signals) will be designed. The team will also develop innovative technology to support the identification of new background knowledge and structured data to improve object extraction, location identification, correlation or integration of relevant data across multiple sources and modalities (social, physical and Web). Novel coupling of socio-linguistic and network analysis will be used to identify important persons and objects, statistical and factual knowledge about traffic and transportation networks, and their impact on hazard models (e.g. storm surge) and flood mapping. Domain-grounded mechanisms will be developed to address pervasive trustworthiness and reliability concerns. Exemplar outcomes are expected to include specific tools for first-responders as well as recovery teams to aid in the prioritization of relief and repair efforts, leveraging improved flood response, urban mapping, and dynamic storm surge models, and interdisciplinary training of students leveraging research in pedagogy, in conjunction with Ohio State University’s new undergraduate major in data analytics, and Wright State University’s Big and Smart Data graduate certificate program.

*Ohio State University.

Prasad (PI) 07/01/2015 - 09/30/2015
Pratt & Whitney Sheth (Co-PI) $89,736
Semantic Web-based Data Exchange and Interoperability for OEM-Supplier Collaboration:
This project will develop a demonstration prototype for OEM-Supplier collaboration for the MAI Data Informatics. Specifically, the web application will have the capability to electronically exchange data and models based on Semantic Web technologies between OEMs and suppliers on an ongoing regular basis subject to security and access control requirements for both sites.

Sheth (PI) 01/01/2015 - 09/30/2017
Ohio Office of Criminal Justice Services Doran, Dustin (Co-PIs) $140,000
Westwood Partnership to Prevent Juvenile Repeat Violent Offenders:
Project Safe Neighborhood is an interdisciplinary project involving the Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) located at Wright State University with other community partners including the city of Dayton (Dayton Police Department), Montgomery County Juvenile Justice, and the University of Dayton to prevent juvenile repeat offenders from committing crime in the Westwood neighborhood located in Dayton, Ohio.

Recipient: Perera 2015 - 2016
Health Data Consortium Sheth (PI) $10,500
2015-2016 George Thomas Fellowship
An unrestricted gift of $10,500 to Wright State University from the Health Data Consortium to cover Sujan Perera’s stipend for the 2015-2016 academic year.


R01 DA039454-01 Sheth, Daniulaityte, Boyer*, Martins**(PIs) 08/15/2014 - 08/14/2017
NIH/NIDA Carlson, Nahhas, Prasad (Co-PIs) Total: $1,689,019
Trending: Social Media Analysis to Monitor Cannabis and Synthetic Cannabinoid Use (eDrugTrends):
The major goals of the project are to: (1) develop a comprehensive software platform, eDrugTrends, for semi-automated processing and visualization of the thematic, sentiment, spatio-temporal, and social network dimensions of social media data (Twitter and Web forums) on cannabis and synthetic cannabinoid use; (2) deploy eDrugTrends to (a) use Twitter and Web forum data to identify and compare trends in knowledge, attitudes, and behaviors related to cannabis and synthetic cannabinoid use across regions of the U.S. with different cannabis legalization policies, and (b) analyze social network characteristics and identify key influencers (opinion leaders) in cannabis and synthetic cannabinoid-related discussions on Twitter.

*University of Massachusetts Medical School; **Columbia University.

R56 DA038366-01 Sheth, Boyer*, Carlson (PIs) 07/01/2014 - 08/31/2016
NIH/NIDA Daniulaityte, Nahhas (Co-PIs) $299,741
NIDA National Early Warning System Network (iN3): An Innovative Approach:
The goal of this study is to establish an advanced surveillance system for emerging drug use in the U.S. by harmonizing data from web-based sources associated with emerging drug use and a network of sentinel sites for emerging drug use staffed by medical toxicologists.

*University of Massachusetts Medical School.

Sheth (PI) 06/02/2014 - 06/01/2015
ezDI $125,000
Cardiology Semantic Analysis System:
This project will conduct research and prototyping of a clinical text understanding system using semantics enhanced natural language processing techniques, with applications to Computerized Document Improvement and Computer-assisted Coding.


FA2386-13-1-3023 Wang (PI) 9/30/2013 - 9/29/2014
DoD, AFOSR Sheth, Chen, Zhang (Co-PIs) $492,500
Software and Hardware Infrastructure for Energy-efficient, Large-scale, Complex Language Modeling:
Shaojun Wang, an associate professor in the Department of Computer Science and Engineering, received $492,500 for software and hardware infrastructure supporting energy-efficient, large-scale complex language modeling. Wang, the principal investigator, received this highly-competitive research equipment award along with co-principal investigators Keke Chen, Amit Sheth and Junjie Zhang. They were one of 29 research groups across the country receiving a total of $12.7 million in awards, funded by the Air Force Office of Scientific Research through the Defense University Research Instrumentation Program (DURIP). The Air Force Office of Scientific Research received 260 proposals requesting $99,852,968 in support for research equipment. Collectively, the Army Research Office, the Office of Naval Research and the Air Force Office of Scientific Research received more than 750 proposals requesting $288 million in support for research equipment.

IIP 1343041 Sheth (PI) 07/01/2013 - 12/31/2013
NSF $50,000
Towards Commercialization of Twitris—A System of Collective Intelligence:
This project plans to further develop a comprehensive platform to compute Collective Social Intelligence from social media data. It currently uses Twitter as a starting point, and brings in news, multimedia, Wikipedia, and Linked Open Data as complementary data and background knowledge for analysis. Its detailed analysis encompasses the spatio-temporal-thematic (where, when, what), people-content-networking (who and how), and emotion-sentiment-intent (perceptions and impact) domains. Given that social media is predominantly used as a means of delivering messages, this technology is designed to gain insights and predict trends using data gathered from social media. Additionally, the project plans to support transitioning research on advanced social media analysis techniques and early-stage technology into real-world products and services to help improve the quality of information available to decision-makers and their products.

SUB 1122472-001 Sheth (PI) 07/01/2013 - 07/30/2014
AFRL/RX Prasad, Srinivasan (Co-PIs) $190,000
Materials Database Knowledge Discovery and Data Mining:
With the collaboration of AFRL/RX, the Kno.e.sis Center is introducing the materials and process community to better data management practices. A data exchange system will be created that will allow researchers to index, search, and compare data while also enabling a shortened transition cycle in material science, which usually takes 5 to 10 years. This multi-disciplinary project will develop material ontologies that establish a common vocabulary and the associated tools to manipulate the structured/unstructured corpus. These ontologies focus primarily on an in-depth biomaterials ontology and a broader materials research ontology. We will also illustrate the use of the ontology we create to semantically search for relevant KDDM biomaterials information quickly. Datasets and software developed as part of this project will be made available to the broader research community in material science.

FA 8750-13-1-0244 Sheth (PI) 06/01/2013 - 12/31/2015
AFRL Prasad, Srinivasan (Co-PIs) $309,028
SemMat: Federated Semantic Services Platform for Open Materials Science and Engineering:
The major goal of the project is to enable flexible sharing and exploitation of materials data through three tasks. The first relates to creating semantic infrastructure to annotate materials data utilizing materials domain models and knowledge bases. The second relates to semantic search for a variety of data, including resources with service-based access. The third relates to the development of a novel open semantic data exchange scheme for materials science (termed Linked Open Materials Data).


IIS 1111182, 1237631, 1349151 Sheth, Parthasarathy* (PIs) 09/01/2011 - 08/30/2015
NSF Shalin, Flach $518,408
SoCS: Collaborative Research: Social Media Enhanced Organizational Sensemaking in Emergency Response:
There have been many examples of using Twitter to provide timely and situational information about emergencies to relief organizations, and to conduct ad-hoc coordination. However, there are few attempts to understand the full ramifications of using social networks in a more concerted manner for effective organizational sensemaking in such contexts. This multi-disciplinary project, spanning computational and social sciences, seeks to fill this gap.

*Ohio State University.

IIS 1143717 Sheth (PI) 09/2011 - 08/2013
NSF Hitzler (Co-PI) $141,828
EAGER: Expressive Scalable Querying Over Linked Open Data:
This project develops exploratory techniques to richly interlink components of LOD and then addresses the challenge of querying the LOD cloud, i.e., of obtaining answers to questions which require accessing, retrieving and combining information from different parts of the LOD cloud. Techniques for overcoming semantic heterogeneity include: semantic enrichment through Wikipedia bootstrapping; semantic integration through abstraction by means of upper-level ontologies; and, massively parallel methods for tractable ontology reasoning. Specifically, this research will: (1) identify richer, broader, and more relevant relationships between LOD datasets at instance and schema level (these relationships will promote better knowledge discovery, querying, and mapping of ontologies); (2) realize LOD query federation through an upper level ontology; and, (3) enable access to implicit knowledge through ontology reasoning.

R21 DA030571-01A1 Sheth, Daniulaityte (PIs) 07/01/2011 - 06/30/2013
NIH/NIDA Carlson, Falck (Co-PIs) $401,500
A Study of Social Web Data on Buprenorphine Abuse Using Semantic Web Technology:
Research involving Semantic Web and NLP techniques to develop the ability to aggregate and analyze Web-based forums about buprenorphine/naloxone and buprenorphine abuse practices. A collaboration between epidemiologists at CITAR and computer scientists at Kno.e.sis.

DRC C00557P 0011596 02 Sheth (PI) 06/2011 - 12/2012
Riverside Research Prasad (Co-PI) $81,539
Trusted Semantic Sensor Web to Support Decision Making Over Massive Amounts of Sensor and Social Data:
The goal of this project is to create integrated access to a variety of multimodal sensor and social data, centered on events and their analysis. This is done in such a way that humans are presented with highly relevant information at a level of abstraction that lends itself to better and more timely decision-making. While many efforts in the semantic sensor web and semantic social web show promise in this area, it is critical that the trustworthiness of observational data, as well as of reported information at higher levels of abstraction, be an integral part of any system that is of value to military decision makers.

Sheth (PI) 04/2011
Janya $25,150
Enhancing Information Extraction with Linked Open Data:
The WSU research will perform tasks related to information extraction with Linked Open Data.

Dept. of Community Health Sheth (PI) 01/2011
Wright State University $30,000
A Study of Social Web Data on Oxycontin Abuse Using Semantic Web Technology:
As part of the PREDOSE (PREscription Drug Abuse Online-Surveillance and Epidemiology) project, the goal is to develop automated data collection and analysis tools to process social media (tweets, web-forums) to understand the knowledge, attitudes, and behaviors of prescription-drug abusers who misuse buprenorphine, OxyContin, and other pharmaceutical opioids. Instead of relying on traditional epidemiological surveillance methods such as population surveys or face-to-face interviews with drug-involved individuals, PREDOSE focuses on the web, which provides venues for individuals to freely share their experiences, post questions, and offer comments about different drugs. Such User-Generated Content (UGC) can be used as a very rich source of unsolicited, unfiltered, and anonymous self-disclosures of drug use behaviors. The automatic extraction of such data enables qualitative researchers to overcome scalability limitations imposed by existing methods of qualitative studies.

Microsoft Corporation Sheth (PI) 2011
SemGrail 2011:
This was an unrestricted gift from Microsoft that did not come with an abstract. Microsoft’s SemGrail concept describes how one can create and use semantics at Internet scale and how semantics can be utilized to support confidentiality.


FA8650-08-D-6801 Sheth (PI) 11/2010 - 10/2013
Ball Aerospace $306,000
T040: LVC Sensors Integration for Data Fusion in Operations and Training:
The focus of this task involves reviewing, providing, and organizing expert-level capability commensurate with the most relevant work in the disaster response training and operations domain as well as obtaining and maintaining knowledge of complementary domains that are developing environments and capabilities for supporting and executing training and operations. Specific technologies of interest include trusted sensing devices and architectures, hand-held communications devices (e.g. cell phones, PDAs, microcomputers, etc.), data networks, signal processing, and data analysis and management. The demonstration and validation of the software and hardware architecture will encompass initial integration testing, the definition and execution of a simulated warfighter scenario to demonstrate data capture and management capability, including recording, replaying, real-time data analysis, and visualization, concluding with a demonstration of a proof-of-concept.

FA8650-09-D-6939 Sheth (PI) 05/2010 - 12/2010
Woolpert $42,500
Information Operations/Cyber Exploitation Research (ICER) Program, City Beat:
The WSU Center of Excellence on Knowledge-enabled Computing will support the City Beat program by extending and integrating on-going research focused on event analysis using social media and the mobile web.

Sheth (PI) 04/2010 - 03/2011
AFRL, WPAFB $300,000
Ontology and Semantic-Aided Human Cognition Knowledge Discovery:
Develop a human cognitive performance (HCP) ontology (HCPO) that will serve as an umbrella covering relevant aspects of human performance (e.g. cognitive performance, physical performance, immune performance, performance degraders, etc.). This knowledge base will help address issues across a continuum of granularity (i.e. from gene → pathway → cell → organ → individual performance and ultimately to group performance and human-system integration). Stressors that can affect human cognition will be included under appropriate categories (see Fig. 3).


06/01/2008 - 05/31/2009
AFRL & OGC/DAGSI $63,281
Architectures for Secure Semantic Sensor Networks for Multi-Layered Sensing:
The WSU student and his mentor will present an approach to annotating sensor data with spatial, temporal, and thematic semantic metadata, building on the current standardization efforts within the W3C and Open Geospatial Consortium.

Sheth (PI) 06/2009
Wright State University Prasad, Wang, Rizki, Wang,
Mateti, Liu, Gallagher, Chen, Pei, Raymer (Co-PIs)
Cloud Computing Collaboratory:
The use of cloud computing in the classroom has the benefit of educating our students in rapidly emerging and highly sought-after technologies. A number of universities have already begun to teach courses on cloud computing and/or use it as an infrastructure to support instruction in a variety of computing-intensive courses. In this proposal, a significant number of faculty members in the Computer Science & Engineering Department request a shared infrastructure for undergraduate and graduate instruction using the House Bill 562 funding. Existing open-source software such as Hadoop will be installed on the cloud, and the students will use it to study novel techniques for solving extremely large-scale computing problems as well as infrastructure and security issues in enabling cloud computing, such as dynamic network layer protocols for advanced cloud computing, resource management, trust, security, privacy, etc.

Sheth (PI) 02/2009 - 07/2011
AFRL $176,890
Semantic Sensor Network:
This proposal will support the objective of the Modeling, Simulation, and Analysis Technology (MSAT) program, which intends to provide qualitative and quantitative analysis of integrated Air Force Research Laboratory (AFRL) information and sensor technologies, systems, and processes. Within the scope of MSAT, this effort will relate to trust in networks based on metadata transferred with the sensor data. The statement of work (SOW) incorporated herein will specifically support Science Applications International Corporation (SAIC) by investigating the issues of trust in a semantic web environment.


IIS 0842129, 0937647 Sheth (PI) 08/2008 - 12/2011
NSF Prasad (Co-PI) $146,952
Spatio-Temporal-Thematic Queries of Semantic Web Data—A Study of Expressivity and Efficiency:
This exploratory research develops new methods for modeling and querying spatial, temporal and thematic (STT) data. The methods differ significantly from traditional approaches for STT data management; they follow a paradigm that goes beyond querying for resources to querying about the relationships between resources. This will lead to three advances in STT data management: (1) new query operators that exploit the graph-centric nature of Semantic Web data models, (2) new indexing and query processing techniques for STT data that are specialized for Semantic Web data models, and (3) an extension of the SPARQL RDF query language to support STT queries.

FA8650-05-2-6518 Sheth, Prasad (PIs) 07/2008 - 03/2009
AFRL $100,000
Human Performance Ontology (HPO):
The project involves extending our work in focused knowledge (entity-relationship) extraction from scientific literature, automatic taxonomy extraction from selected community-authored content (e.g., Wikipedia), and semi-automatic ontology development with limited expert guidance. These are combined to create a framework that will allow domain experts and computer scientists to semi-automatically create knowledge bases through an iterative process. The final goal is to provide superior (both in quality and speed) search and retrieval over scientific literature for life scientists that will enable them to elicit valuable information in the area of human performance and cognition.

Sheth (PI) 07/2008 - 03/2009
Undisclosed Sponsor $120,000
Trusted Semantic Sensor Web:
Trust and confidence are becoming key issues in diverse applications such as ecommerce, social networks, semantic sensor web, semantic web information retrieval systems, etc. Both humans and machines use some form of trust to make informed and reliable decisions before acting. In this work, we briefly review existing work on trust networks, pointing out some of its drawbacks. We then propose a local framework to explore two different kinds of trust among agents called referral trust and functional trust that are modeled using local partial orders, to enable qualitative trust personalization. The proposed approach formalizes reasoning with trust, distinguishing between direct and inferred trust. It is also capable of dealing with general trust networks with cycles.

R01 HL087795-01A1 Sheth (PI) 05/2008 - 06/2012
NHLBI Tarleton*, Musen**, Noy**,
Doshi* (Co-PIs)
Semantics and Services Enabled Problem Solving Environment for Trypanosoma cruzi:
Collaborative R01 led by Wright State University with University of Georgia and Stanford University as partners. The project supports the scientific analysis of the parasite T. cruzi, the principal causative agent of Chagas disease in humans. The problem solving environment (SPSE) allows data analysis and knowledge discovery through the dynamic integration of lab and public data to answer biological questions at multiple levels of granularity. (Also supported by an ARRA supplement.)

*University of Georgia; **Stanford University.

Sheth (PI) 02/2008
Microsoft Research $60,000
Chatter, Intent, Good Karma, and Contextual Advertisements in Social Networks:
Advertising approaches to monetizing user content on social networking sites (SNSs) are profile-based contextual advertisements, demographic-based ads, or a combination of the two. Content-based or contextual ads are generated by automatically finding relevant keywords on a network page and displaying ads based on those keywords. Such advertising on SNSs uses information on member profiles, such as interests and activities, for delivering ads. While profile information might be useful for launching product campaigns and micro-targeting customers, it does not necessarily contain current user interests or purchase intents. Ads generated from such content are inherently less relevant to a user. Over time, this leads to a scenario where ad campaigns see several ad impressions but very few click-throughs. The proposed research will exploit user activity on public venues (such as forums, marketplaces, and groups) on SNSs in addition to their profile information for generating ads on user profiles. The core components of the research are identifying monetizable user activity on SNSs, observing user response behaviors, and extracting keywords and phrases in user posts while eliminating off-topic chatter.


Sheth (PI) 11/2007
IBM $23,000
UIMA-based Infrastructure for Summarizing Casual, Unstructured Text:
A proposal based on Meena Nagarajan's summer 2007 internship with the Semantic Super Computing group at IBM Almaden.

Sheth (PI) 04/2007 - 08/2007
AFRL $32,000
Sensor Data Management:
Provides centralized data management support to the sensor exploitation research and development community throughout the Department of Defense.


Sheth (PI) 06/01/2006 - 06/30/2018
Advanced Data Management Resource (Ohio's Research Challenge 2005-2007 Biennium):
These funds will be used to acquire, renovate, rehabilitate or construct facilities and purchase equipment to be used by an Eminent Scholar in the conduct of research.


IIS 0545243 Sheth (PI) 10/2005 - 12/31/2007
NSF Arpinar, Kochut, Miller (Co-PIs) $100,000
SemGrid (Semantic Discovery on Adaptive Services Grid):
This early-stage NSF-funded project collaborates with the large EU-funded ASG project. It involves investigating the use of semantic associations in Web Service discovery and dynamic Web Process composition, and computing Semantic Associations over the grid.

Sheth (PI) 04/2005 - 09/2006
ARDA $325,000
SemDis: Financial Irregularity Detection:
This research involves an ontological approach to financial analysis and monitoring by extracting, disambiguating and merging financial data from multiple heterogeneous sources into a common ontological framework. Analysis of the data and inference of semantic relations may then be conducted in a coherent, incorporated and consistent manner. Financial inconsistencies and/or "suspicious activity" can then be detected automatically. MathML, the Mathematical Markup Language, is utilized to represent financial formulas and then extended to provide semantic provenance of data within an ontology.


IIS 0738251, 0714441 Sheth (PI) 10/01/2003 - 09/30/2007
NSF Arpinar, Kochut, Miller, Joshi,
Finn, Yesha (Co-PIs)
SemDis: Discovering Complex Relationships in Semantic Web:
This NSF-funded medium-ITR project involves modeling, discovering and reasoning about complex relationships on the Semantic Web that will transform the hunt for documents into a more automated analysis leading to insight and knowledge discovery. Among the various outputs of the project are SWETO (Semantic Web Technology Evaluation Ontology), TOntoGen (Test Ontology Generation Tool), BRAHMS (A WorkBench RDF Store And High Performance Memory System for Semantic Association Discovery), and several algorithms for semantic association discovery over large RDF graphs, relationship-based document ranking, and ranking of complex relationships. (Note: $800,000 of the awarded funds went to the University of Georgia, and $450,000 went to the University of Maryland, Baltimore County.)

Sheth (PI) 07/01/2003 - 06/30/2008
NIH Kochut, Miller (Co-PIs) $709,401
Bioinformatics for Glycan Expression:
This project is part of a larger National Cancer Research Resource center, and involves substantial collaborations between LSDIS researchers and biologists at the Complex Carbohydrate Research Center. Key results include GLYDE (a representation standard that is being adopted by the community of Glycomics researchers), GlycO (a very large and comprehensive ontology with 600+ classes, 11 levels deep), ProPreO (a large ontology that captures the processes used in high-throughput experiments), a tool for semantic search and browsing of large populated ontologies, development of bioinformatics semantic web services (using WSDL-S) and a directory (semantic UDDI), semantic annotation of non-textual experimental data, etc. Recent work involves investigating a pathway development workbench for genomic researchers, with integrated access, analysis and discovery support covering experimental as well as textual data.

Sheth, Bhandarkar, Li, Waterson
System-Level Technique for Energy-Aware Computing:


0219649 Sheth (PI) 07/01/2002 - 06/30/2005
NSF Arpinar, Kochut (Co-PIs) $212,000
SAI (Association Identification and Knowledge Discovery for National Security Applications):
Information technology (IT) is recognized as a critical component in the effort to improve national security, including homeland defense. Applications of importance to national security, such as aviation security, pose significant challenges to current information technology and provide an excellent source for further research in developing next-generation IT solutions. Recently, there have been significant advances in applying techniques from database and information systems, knowledge representation, AI, information retrieval including text categorization, lexical and language analysis, and others in developing a new generation of semantic technologies. Semantic technologies help in associating meaning with data, in more meaningfully organizing and correlating data, in converting data into information for more effective decision making, and in finding information that is contextually relevant to users' needs. They help with syntactic and representational as well as semantic interoperability. This general area of research is also getting renewed attention now that there is considerable excitement about the vision of the Semantic Web, characterized as the next phase of the Web.

Sheth (PI) 04/01/2002 - 03/31/2003
NSF $26,250
Database and Information Systems Research for Semantic Web and Enterprises:


Sheth (PI) 01/01/2000 - 09/30/2000
NRL, ITT $200,000
Extending METEOR with Workflow Reuse, Adaptation, and Collaboration:
Today, companies – large and small – can select Workflow Management Systems (WfMSs) to support their business processes. When processes are critical, it is fundamental that WfMS infrastructures continue to provide pre-established service levels to users in the face of disruptions. Adaptation addresses precisely this issue. Current architectures do not incorporate adequate solutions that enhance WfMSs’ adaptation. In this paper we present a set of comprehensive techniques to be used in the development of WfMSs to increase their level of adaptation. We discuss how workflow adaptation can be triggered, which adaptation strategies can be applied, and why dynamic changes are indispensable to carry out adaptation. We target adaptation not only from a functional perspective, but also from an operational perspective. For the strategies presented, we describe their implementation in the METEOR WfMS.


N00173-98-2-L005 Sheth (PI) 07/01/1998 - 12/31/1999
NRL $330,542
Workflow Management for Advanced DOD Applications:
Workflow Management Systems (WfMSs) are used to support the modeling and coordinated execution of business processes within an organization or across organizational boundaries. Although some research efforts have addressed requirements for authorization and access control for workflow systems, little attention has been paid to the requirements as they apply to application data accessed or managed by WfMSs. In this paper, we discuss key access control requirements for application data in workflow applications using examples from the healthcare domain, introduce a classification of application data used in workflow systems by analyzing their sources, and then propose a comprehensive data authorization and access control mechanism for WfMSs. This involves four aspects: role, task, process instance-based user group, and data content. For implementation, a predicate-based access control method is used. We believe that the proposed model is applicable to workflow applications and WfMSs with diverse access control requirements.


Sheth (PI) 06/1996 - 05/1999
NIST-ATP Kochut, Miller, Shah (Co-PIs) $172,129
Collaborative Tele-Consulting for Healthcare (CaTCH):
This project involves integrating multimedia patient data on an intranet, medical reference data on the Internet, LAN/POTS/ISDN-based Video+Data Conferencing, and WWW/Java programming to set up a remote environment and context-sensitive collaboration. The Medical College of Georgia is our primary application partner for this project.


95-FI38400-000 Sheth (PI) 04/15/1995 - 08/30/1997
Community Management Staff (CIA) $197,030
InfoHarness: A Scalable System for Searching Heterogeneous Information:
InfoHarness supports the search and retrieval of heterogeneous information in intranet/Internet environments. The basic InfoHarness system, and the corresponding commercial product Adapt/X Harness from Bellcore, provides access to heterogeneous textual and semi-structured data without restructuring, reformatting and relocating the data. It also supports logical restructuring of the information space, multiple third-party indices, and many other features.

70ANB5H1011 Sheth (PI) 03/1995 - 08/1999
NIST-ATP Miller, Kochut (Co-PIs) $1,674,729
METEOR: Healthcare Information Infrastructure Program—Enterprise Workflow Automation:
The METEOR system focuses on R&D of innovative Multiparadigm Transactional Workflow Management technology. Workflow management techniques and systems developed in this project support coordination of user and automated tasks in real-world multi-enterprise heterogeneous computing environments over Web and CORBA+Java infrastructures. Technical issues include a GUI toolkit for workflow design/monitoring/simulation, automated code generation for distributed workflow applications from graphical design, scheduling, EDI, multidatabase access, error/failure handling and recovery (using transactional concepts/techniques) as appropriate, and security.


METEOR-S: Semantic Web Services and Processes:
The growth of Web services and service-oriented architecture (SOA) offers an attractive basis for realizing dynamic architectures, which mirror the dynamic and ever-changing business environment. With the help of industry-wide acceptance of standards like Business Process Execution Language for Web Services (BPEL4WS), Web Services Description Language (WSDL) and Simple Object Access Protocol (SOAP), Web Services offer the potential of low-cost and immediate integration with other applications and partners. The METEOR-S project at the LSDIS Lab, University of Georgia aims to extend these standards with Semantic Web technologies to achieve greater dynamism and scalability. Specifically, [Verma et al., 2004a; Sivashanmugam et al., 2003] focus on adding semantics to WSDL and UDDI (this work, termed WSDL-S, is being provided as input for the next version of WSDL, which will support semantic representation), [Verma et al., 2004b; Sivashanmugam et al., 2004] focus on adding semantics to BPEL4WS, and [Patil et al., 2004] discusses a semi-automatic approach for annotating Web services described using WSDL.

SeNS (Semantically Enabled Networking and Services):
This project seeks to take semantics to the middleware and network levels, starting with the definition and prototyping of a semantic overlay network for scalable information dissemination.

ASD (Active Semantic Documents):
The LSDIS lab's collaborative research project on Active Semantic Electronic Patient Record with the Athens Heart Center (AHC) exemplifies an implementation of ASDs in a healthcare (more specifically cardiology practice) environment. It has so far involved: (1) the development of populated ontologies in the healthcare (specifically cardiology practice) domain; (2) the development of an annotation tool that utilizes the developed ontologies for annotation of patient records; and (3) the development of decision support algorithms that support rule- and ontology-based checking/validation and evaluation. The most important benefits we seek from ASEMR (with its proactive semantic annotations and rule-based evaluation) are (1) the reduction of medical errors that could occur as an oversight, (2) prescription drug savings for patients through checks such as preferred drug recommendations, leading to improved satisfaction, and (3) assistance in choosing the medically appropriate ICD-9 code, which could lead to fewer communications with the insurer and faster payment.

Semantic Middleware:
This project is doing fundamental research and developing core technologies related to the challenging problems of entity and relationship identification/recognition, extraction/annotation, entity resolution/disambiguation, matching, mapping, and rule processing. An interim outcome is the concept of Active Semantic Documents, which is already deployed as an Active Semantic Electronic Medical Record application at the Athens Heart Center.

CRW165049 Sheth (PI)
Model Generation and Model Extension for Information and Knowledge Management:
For this proposal, three main points were identified: 1) Building a domain model based on the user's interest; 2) Identifying concept descriptions vs. concept mentions in text; 3) Focused search and browse guidance. The work on Domain Taxonomy creation was mostly finished prior to the project start. However, for the purpose of finding inter-entity relationships, the researchers found it useful to create more entity-heavy models than were created in Doozer or Taxonom.com. For this reason we modified Doozer to explore more Wikipedia pages and cut fewer of them, if there was still enough evidence that they were in the desired domain.

HP Live Information Management:
We propose a framework that, either visibly or invisibly to the user, allows guided search and browsing as well as classification and selection of information to improve information processing and knowledge sharing. For this, we identify three main requirements: 1) a domain model based on the user's interest, automatically generated on the basis of a simple domain description, 2) identification of concept descriptions vs. concept mentions in text, and 3) a focused search with browsing guidance. User interest is highly individual, so the more focused it is on a particular area of interest, the less likely a generic model will match any given user. Still, many topics of interest are based on well-known concepts described on some Web pages, often in collaboratively authored content such as Wikipedia. When we can identify, from user behavior or explicit queries, the Wikipedia concepts of interest to individual users, we can carve out a part of Wikipedia (a form of relevant collective intelligence) that reflects their current interest. This is a form of blended semantic search and browsing that brings our past efforts into a personalized form while leveraging collective intelligence.

Professor Amit P. Sheth
Artificial Intelligence Institute
Department of Computer Science & Engineering
University of South Carolina