|
Work
at TRDDC is focused in different groups, each specializing in a key
area. Projects of an interdisciplinary nature are also carried out. With
expertise in process engineering, software engineering tools and
technologies, advanced techniques, and in systems engineering
methodologies, TRDDC provides solutions within TCS and for major
clients.
The
research work is academically rigorous; researchers from TRDDC regularly
present their work at international symposia and publish their papers in
reputed journals.
For the year
2006-07, TRDDC plans to offer projects for final year students of
engineering in the following three areas:
|
1. Consistency Checking of Requirements Document |
A
requirements document consists of a data dictionary modeled as a
business entity diagram and a detailed description of the business
operations written in English. The objective is to check that the
words used in the English description are consistent with the
words in the data dictionary. This requires identifying nouns and
verbs in the English text and checking if an entry exists for the
noun or verb or its variant in the data dictionary. The tool
should also check if a verb has been used consistently.
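The dictionary lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary entries, the stop-word list, and the crude suffix rules are all hypothetical, and a real tool would use a part-of-speech tagger to pick out nouns and verbs rather than treating every non-stop word as a candidate.

```python
import re

# Hypothetical data dictionary entries (entity and operation names).
DATA_DICTIONARY = {"customer", "account", "order", "approve", "cancel"}
# Hypothetical stop words standing in for proper noun/verb identification.
STOP_WORDS = {"the", "a", "an", "and", "or", "is", "are", "of", "to",
              "for", "by", "then", "each"}

def variants(word):
    """Return the word plus crude base forms obtained by stripping suffixes."""
    w = word.lower()
    forms = {w}
    for suffix in ("ing", "ed", "es", "s"):
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            stem = w[: -len(suffix)]
            forms.add(stem)          # e.g. "accounts" -> "account"
            forms.add(stem + "e")    # e.g. "approved" -> "approve"
            if len(stem) > 2 and stem[-1] == stem[-2]:
                forms.add(stem[:-1])  # e.g. "cancelled" -> "cancel"
    return forms

def unknown_terms(text, dictionary=DATA_DICTIONARY):
    """List words whose variants have no entry in the data dictionary."""
    unknown = []
    for word in re.findall(r"[A-Za-z]+", text):
        w = word.lower()
        if w in STOP_WORDS:
            continue
        if not (variants(w) & dictionary) and w not in unknown:
            unknown.append(w)
    return unknown

description = ("The manager approves each order and then "
               "the customer accounts are cancelled.")
print(unknown_terms(description))
```

Here "manager" is reported because neither it nor any variant appears in the dictionary, which is the kind of inconsistency the tool is meant to flag.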
|
|
2. Development of predicate abstraction prototype tool for C programs |
Many of
the software engineering requirements emanating from quality
assurance require some sort of model checking over the source
code. Because a general program has an infinite state space, it is
impossible to model check it directly. Instead, an
abstraction of the program is required with respect to the properties
of interest. The objective of this project is to create an
abstraction of a given program (which will be another program)
with respect to properties of interest represented as predicates
over program entities.
The tool is expected to take as input the
predicates of interest. These predicates would be
defined in terms of existing program variables. The objective is to
generate another program, which is equivalent to the original one
with respect to the predicates. That is, every possible path in
the original program corresponds to a path in the generated
program and predicate values computed in both cases are the same.
The project will involve using our existing static data flow
analysis capabilities to come up with as precise an abstracted
program as possible. This will be along similar lines to the c2bp
tool of the SLAM toolkit from Microsoft.
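To make the idea concrete, the sketch below abstracts a single assignment with respect to a single predicate. It is an illustration only, not the project's method: it decides the effect on the predicate by enumerating a small bounded range of integers, which is an assumption made for brevity and is not sound in general. Tools such as c2bp rely on a theorem prover for this step. The function name `abstract_assignment` is hypothetical.

```python
def abstract_assignment(update, predicate, domain=range(-50, 51)):
    """Abstract one assignment on x with respect to one predicate on x.

    Returns a dict mapping the predicate's value before the statement to
    its value afterwards: True, False, or "*" (nondeterministic choice)
    when both outcomes are observed on the bounded domain.
    """
    outcomes = {True: set(), False: set()}
    for x in domain:
        outcomes[predicate(x)].add(predicate(update(x)))
    result = {}
    for before, after in outcomes.items():
        if len(after) == 1:
            result[before] = next(iter(after))
        elif after:
            result[before] = "*"
    return result

# Concrete statement: x = x + 1, predicate of interest b: (x > 0).
table = abstract_assignment(lambda x: x + 1, lambda x: x > 0)
print(table)
```

For this statement the table says: if b held before, it holds afterwards; if it did not, the new value is unknown. The abstract program would therefore contain something like `b = b ? true : *` in place of `x = x + 1`.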
|
|
3. Design Abstractions |
To
represent software systems as abstract design models (devoid of
program details) is a known but complex problem. With a focus on
legacy systems, we are working towards defining and arriving at
abstractions that represent different aspects of software systems.
Example abstractions for online programs are GUI, Navigations,
Services, Calculations. We aim to extract the abstractions as
automatically as possible. Given the source code of the system,
the challenge is to use different program analysis techniques (static
data-flow analysis, constraint analysis, etc.) and combine them to
build the abstractions.
Aspects
of a software system that we wish to explore and build prototype
extraction tools for this year are:
- Use-Cases: With a basic conceptualization of ideas
completed, we wish to define extraction methods based
on static program analysis and implement them to develop
prototype tools.
- Components: Using prototype tools of design model
extraction, we wish to explore the re-factored ‘design’ of
a legacy system to identify component boundaries.
|
|
4. Java Program Analysis |
To
support continuous evolution of Java programs and prevent them
from becoming legacy, the programs must continue to exhibit
certain properties, for example, modularity, security, and
performance. We propose to apply program analysis techniques on
Java programs to identify programming patterns that exhibit such
properties. With the evolved (modified) program, we would like to
check and measure these properties, and ensure that the properties
are retained.
The
specific work areas that we would like to explore this year
are:
- Java analysis: Specialize the generic program
analyzer to analyze Java programs for control flow and
data flow properties.
- For a few specific properties of Java programs in
the area of security, we would like to build prototype
tools to measure the properties and compare them for
multiple versions of a set of Java programs.
|
| 5. Implementation of an architectural framework for semantically correct systems integration |
Enterprises
are witnessing an increased thrust on collaboration and integration
of existing applications to provide value-added services across
the entire supply chain. Future enterprise systems are likely to
be assembled from customized, off-the-shelf offerings and
harvested legacy systems into a service-oriented architecture.
The traditional organization of an enterprise as a set of
functionally distinct departments leads over time to a set of
isolated applications providing point solutions, each constructed
for a specific purpose with context-specific built-in assumptions
hard-coded in its implementation. We propose an architectural
framework for integrating these disparate applications into a
semantically consistent framework. The framework is based on a
component abstraction that augments the existing component
abstraction with data models, process models, constraints,
assertions, and pre- and post-conditions. The framework has three
layers:
- Enterprise layer that specifies the desired integrated system,
- Application layer that specifies existing applications being integrated, and
- Integration layer that specifies the integration requirements of these
applications.
A set of properties that need to be satisfied for
semantically correct integration is proposed along with a set of
verification techniques. The proposed architecture provides a
foundation for a systematic method for executing systems
integration projects.
|
| 6. Business IT Alignment |
Information
Technology is meant to deliver the information systems needed to
achieve business objectives of an enterprise. The business domain
is made up of sub-domains, namely its strategies and the
infrastructure to implement those strategies. Similarly, the IT
domain is made up of its sub-domains: its strategies and the
information systems that implement those strategies to support
business. These domains must be aligned so that they support each
other and hence business objectives can be achieved effectively.
One of
the reasons this alignment is missing today is that the domains'
specifications do not exist in a single container and do not
have bi-directional trace-ability. Having a single container
with all specifications and with bi-directional trace-ability will
help achieve better alignment. However, the actors operating within these
domains have different needs and competencies. The challenge is to
build a single specification mechanism with bi-directional
trace-ability that is usable by different kinds of actors. The Ajax
Web development technique provides a way to develop such a
specification mechanism.
This project would involve building a prototype
Web application to capture models of multiple domains with
bi-directional trace-ability, using the Ajax approach. The domains of
interest are the business infrastructure domain and the information
systems domain. The modeling notations to be used for each one
need to be fixed as part of the project. The required
trace-ability and usability needs will have to be defined and the
Web application implementing them should be developed.
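One way to picture the "single container with bi-directional trace-ability" is a store that keeps every trace link in both directions. The sketch below is only an assumption about how such a container might look on the server side. The class name, element identifiers, and domain labels are hypothetical, and the modeling notations are deliberately left out since they are to be fixed as part of the project.

```python
from collections import defaultdict

class TraceabilityStore:
    """Single container for elements of several domains plus trace links."""

    def __init__(self):
        self.elements = {}                 # element id -> domain name
        self._forward = defaultdict(set)   # source id -> target ids
        self._backward = defaultdict(set)  # target id -> source ids

    def add_element(self, element_id, domain):
        self.elements[element_id] = domain

    def link(self, source_id, target_id):
        """Record a trace link so it can be followed from either end."""
        if source_id not in self.elements or target_id not in self.elements:
            raise KeyError("both ends of a trace link must be registered")
        self._forward[source_id].add(target_id)
        self._backward[target_id].add(source_id)

    def traces_to(self, element_id):
        return sorted(self._forward.get(element_id, ()))

    def traced_from(self, element_id):
        return sorted(self._backward.get(element_id, ()))

store = TraceabilityStore()
store.add_element("process:order-fulfilment", "business infrastructure")
store.add_element("system:order-db", "information systems")
store.add_element("system:billing", "information systems")
store.link("process:order-fulfilment", "system:order-db")
store.link("process:order-fulfilment", "system:billing")

print(store.traces_to("process:order-fulfilment"))
print(store.traced_from("system:billing"))
```

Keeping both indexes up to date inside `link` is what lets an actor on either side ask which elements in the other domain depend on, or support, a given element.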
|
| 7. Player performance tracking system |
This
project involves developing an application which implements the
methodology devised by TRDDC for evaluating player performances in
cricket. It involves using ball-by-ball statistics available as
public-domain information and converting it to player performance
indicators. It also includes development of new algorithms for
normalizing these indicators over time and across matches. Another
important aspect of the project is developing intuitive
visualizations for these performance indicators.
Skills required/to be learnt: C++/Java
programming, visualization technologies (e.g. Flash), statistical
analysis, technical report writing.
|
| 8. Intelligent data analytics backboard |
This
project involves the development of a generalized architecture for
facilitating data analysis. It aims at providing a visual
application to the data analytics experts for rapidly analyzing a
given set of data. It involves the development of a formal syntax
for forming analytics execution chains, their validation, a
drag-and-drop interface for data analytics library function
blocks (some of which are already available) and, more importantly,
an advisory system that suggests the next steps to the data
analytics expert given the nature of the data.
Skills required/to be learnt: C++/Java
programming, statistical analysis, artificial intelligence
techniques, formal specification, technical report writing.
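The validation of analytics execution chains mentioned above might, in its simplest form, check that each block's input type matches the previous block's output type. The sketch below assumes a chain is just an ordered list of block names. The block names and type labels are hypothetical and do not describe the existing library.

```python
# Hypothetical library blocks: name -> (expected input type, output type).
# None as the input type marks a block that starts a chain.
BLOCKS = {
    "load_csv": (None, "table"),
    "drop_missing": ("table", "table"),
    "normalize": ("table", "table"),
    "cluster": ("table", "labels"),
    "plot_labels": ("labels", "chart"),
}

def validate_chain(chain, blocks=BLOCKS):
    """Return a list of error messages; an empty list means the chain is valid."""
    errors = []
    current = None  # type produced by the previous block
    for position, name in enumerate(chain):
        if name not in blocks:
            errors.append(f"step {position}: unknown block '{name}'")
            break  # types after an unknown block cannot be checked
        expected, produced = blocks[name]
        if expected != current:
            errors.append(
                f"step {position}: '{name}' expects {expected}, got {current}"
            )
        current = produced
    return errors

print(validate_chain(["load_csv", "normalize", "cluster", "plot_labels"]))
print(validate_chain(["load_csv", "cluster", "normalize"]))
```

The same type table could also drive a simple advisory step: the blocks whose expected input equals the current output type are the candidates for what to do next.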
|
| 9. Parallel Support Vector Machines using high-performance computing |
Current
software and hardware configurations limit the computation
and storage available for building large SVMs. This
problem deals with leveraging the state of the art in the area of
high-performance computing (HPC) to address the computational
limitations of SVMs. Possible solutions include parallel processing of
SVMs, multi-threading, and the use of new software and hardware resources
such as different compilers and platform-specific optimized
libraries.
|
| 10. Data reduction techniques for faster SVM model selection |
This
problem addresses the time taken to conduct model selection for
SVM. The long time taken for model building is a limitation
particularly in applications where large-scale databases are
frequently updated. Down-sampling the data and/or limiting
the search space for parameter optimization are commonly used
approaches to speed up model building. However, the speed-accuracy
trade-off needs to be investigated. This project aims to
investigate various data reduction techniques that expedite
model selection.
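As one example of the down-sampling approach mentioned above, the sketch below reduces a labelled data set while keeping the class proportions roughly intact. Stratified sampling is an assumption chosen for illustration, not necessarily the technique the project would settle on, and the function name is hypothetical.

```python
import random
from collections import defaultdict

def stratified_downsample(samples, labels, fraction, seed=0):
    """Keep about `fraction` of the samples from every class.

    Returns a list of (sample, label) pairs. At least one sample per
    class is kept so that no class disappears from the reduced set.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    reduced = []
    for label, group in by_label.items():
        keep = min(len(group), max(1, round(len(group) * fraction)))
        reduced.extend((s, label) for s in rng.sample(group, keep))
    return reduced

samples = list(range(100))
labels = [0] * 80 + [1] * 20
reduced = stratified_downsample(samples, labels, fraction=0.1)
print(len(reduced))
```

Model selection would then run on `reduced` instead of the full data, and the speed-accuracy trade-off could be studied by repeating the experiment for several values of `fraction`.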
|
| 11. Data visualization |
This
is a general data analysis problem that involves investigation of
various techniques used to display and explore
multidimensional data. This would also include researching various
tools available for data visualization. There are two broad
approaches to visualization of data. One approach deals with
aggregation of elements in the data into some new information
using methods such as PCA and hierarchical clustering. The second
approach deals with mapping the data elements to a two- or
three-dimensional space in some way. We are more interested in the
second approach to data visualization. The task would be to
research a few such techniques.
|
| 12. Dimensionality reduction and variable selection |
This
problem investigates various techniques to reduce the
dimensionality of data by retaining the most valuable variables.
Choosing the most important attributes is key to building an
efficient classification or regression model. The task would be to
implement techniques such as Principal Component Analysis (PCA) and
correlation analysis, and to compare their performance on
high-dimensional data.
General relevant areas: Machine Learning, Pattern
Recognition, Artificial Intelligence, Data Mining, Mathematical
Modeling, Data Visualization, Parallel Processing, High
Performance Computing, Cascading, and Multi-Threading.
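For the correlation-based variable selection mentioned under project 12, a minimal sketch could rank variables by the absolute Pearson correlation between each variable and the target. The column names and data below are made up for illustration. Note that a real study would also need to handle correlated predictors, which this simple ranking ignores.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / sqrt(var_x * var_y)

def rank_variables(columns, target):
    """Order variable names from most to least correlated with the target."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical data: three candidate variables and one target.
columns = {
    "a": [1, 2, 3, 4, 5],   # follows the target exactly
    "b": [2, 1, 4, 3, 5],   # follows the target loosely
    "c": [7, 7, 7, 7, 7],   # constant, no information
}
target = [10, 20, 30, 40, 50]

print(rank_variables(columns, target))
```

Keeping only the top-ranked names gives a reduced variable set whose effect on model quality can then be compared with a PCA-based reduction.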
|
| 13. Integration Verification |
Consider
an application A that wants to use component B. We are concerned
about how to verify that A is using B properly and that the two
are integrated properly. Whether B has been tested properly, or
whether A functions correctly are separate concerns and are not
addressed in this note. Moreover, it is assumed that adequate
methods have been used to test the component B before it is
actually used with application A.
Initially,
we only look at the situation where functions in A invoke
functions in B. For now, we do not look at situations where A and
B only communicate through a common state. We also assume that B
in turn does not call functions in A. These dimensions of the
problem will be considered after we have addressed the basic
problem.
The
following are some of the common problems in integration:
- Functions are invoked without proper parameter values
- Function results are not interpreted correctly
- Errors and exceptions are either not handled or improperly handled
- Called function behaves differently from what is expected by the caller
- Functions are not always invoked in proper sequence, leading to
unexpected behaviour that is often not repeatable
The
impact of such problems is felt in many ways:
- Defects may appear unexpectedly, even after an exhaustive test
- When a defect is encountered, it is not easy to identify whether it
is due to a fault in the calling program or the called
function
When the defects are due to incorrect calling
sequences, the cause of the problem is hard to locate.
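Incorrect calling sequences are one class of problem that can be caught mechanically. The sketch below wraps component B in a checker that follows a small state machine of allowed calls and reports the first call made out of order. Everything here is an illustrative assumption: the component, its functions, and the transition table are hypothetical.

```python
class SequenceError(Exception):
    """Raised when a function of B is invoked out of the allowed order."""

class SequenceChecker:
    """Wrap a component and reject calls that violate an allowed call order."""

    def __init__(self, component, transitions, start="start"):
        self._component = component
        self._transitions = transitions  # (state, function name) -> next state
        self._state = start

    def __getattr__(self, name):
        # Only reached for names not found on the checker itself.
        target = getattr(self._component, name)

        def checked(*args, **kwargs):
            key = (self._state, name)
            if key not in self._transitions:
                raise SequenceError(
                    f"'{name}' called in state '{self._state}'")
            result = target(*args, **kwargs)
            self._state = self._transitions[key]
            return result

        return checked

class StoreComponent:
    """Hypothetical component B with an open/write/close protocol."""

    def __init__(self):
        self.data = []

    def open(self):
        return "opened"

    def write(self, item):
        self.data.append(item)
        return len(self.data)

    def close(self):
        return "closed"

TRANSITIONS = {
    ("start", "open"): "open",
    ("open", "write"): "open",
    ("open", "close"): "start",
}

b = SequenceChecker(StoreComponent(), TRANSITIONS)
b.open()
b.write("record")
b.close()
```

Calling `b.write(...)` again at this point raises `SequenceError`, which points directly at the offending call in A instead of leaving a hard-to-reproduce failure somewhere inside B.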
|
| 14. Data Generation for Load and Performance Testing |
Load and
performance testing requires very large amounts of data – in
terms of the database as well as inputs. This data has to be
consistent so that the application under test will process it
correctly. Also, the data has to satisfy a target profile (for
example, in an hour there are 20000 withdrawal transactions, 5000
balance inquiries, and 10 cheque book requests where there are 2
million accounts). Currently, this setup process can take several
months. The objective is to reduce this time period and effort
significantly.
This
project will explore methods to set up such a synthetic database
that has data consistent with the application's needs and
conforms to a given profile. Typical test data generation
methods will not work in a real life situation. There is no
product in the market that can do this.
This project will give very good exposure to
databases and to the needs of performance testing in a real-life
environment.
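A toy version of profile-driven input generation is sketched below, using the hourly profile quoted earlier scaled down by a factor of ten. It only shows the two constraints named in the text: the transaction mix matches the target profile, and every input refers to an account that exists. The function name, transaction labels, and account numbering are hypothetical, and generating the matching database contents is not attempted here.

```python
import random
from collections import Counter

def generate_workload(profile, account_ids, seed=0):
    """Return a shuffled list of (transaction type, account id) pairs.

    The count of each transaction type matches `profile` exactly, and
    every account id is drawn from `account_ids`, so the inputs stay
    consistent with the accounts in the synthetic database.
    """
    rng = random.Random(seed)
    workload = [
        (kind, rng.choice(account_ids))
        for kind, count in profile.items()
        for _ in range(count)
    ]
    rng.shuffle(workload)
    return workload

# Hourly profile from the text, divided by 10.
profile = {"withdrawal": 2000, "balance_inquiry": 500, "cheque_book_request": 1}
accounts = list(range(1, 200_001))  # 200,000 hypothetical account ids

workload = generate_workload(profile, accounts)
print(Counter(kind for kind, _ in workload))
```

A fixed seed makes the generated workload reproducible, which helps when comparing performance runs against each other.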
|
| 15. Stub Generation |
Very
often when we are testing our piece of code, we need someone
else’s code, which may not have been developed yet. So, instead
of waiting we would like to create a stub for the unavailable code
and use it in our testing.
The key challenge in this project is how to
generate stubs that behave intelligently, as though the actual code
were available. How do we write simple specifications for a stub to
meet our testing needs? This builds upon our earlier work in test
data generation.
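One possible shape for a "simple specification" is an ordered list of rules, each pairing a condition on the arguments with the response the stub should give. The sketch below builds a stub from such a list. It is an assumption for illustration only: `make_stub`, the rule format, and the `get_balance` example are hypothetical and not taken from the earlier test data generation work.

```python
def make_stub(rules, default=None):
    """Build a stub function from an ordered list of (condition, response) rules.

    The first rule whose condition accepts the call arguments decides the
    response. A response that is an exception instance is raised, which
    lets tests exercise the caller's error handling.
    """
    calls = []

    def stub(*args):
        calls.append(args)
        for condition, response in rules:
            if condition(*args):
                if isinstance(response, Exception):
                    raise response
                return response
        return default

    stub.calls = calls  # record of every invocation, for later inspection
    return stub

# Hypothetical specification for an unavailable get_balance(account_id).
get_balance = make_stub([
    (lambda account_id: account_id < 0, ValueError("invalid account")),
    (lambda account_id: account_id == 42, 0.0),
    (lambda account_id: True, 1000.0),
])

print(get_balance(42))
print(get_balance(7))
```

The recorded `calls` list also shows how the code under test used the stub, which helps check parameter values as well as results.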
|
| 16. Mock Database |
Imagine
that you are a developer using MasterCraft or any model-based
development tool. Let's say you have modeled your classes, written
your queries and are ready to test your services or functions
written in Java.
Are you
faced with the following issues?
- Do you have to wait long for someone to set up a database environment
for you?
- How do you ensure that the data created in the database is done
correctly?
- Do you have to spend a lot of time to restore your database after one
round of testing?
- How do you avoid conflicts with other people testing?
- Do you have a hard time creating data to satisfy all possible test
conditions?
- Are you able to test your exception handling mechanisms well?
What if
you did not have to set up a database, but your database queries
returned meaningful results as you test the service or function?
Also, what if a log were created of the database reads and writes
that took place, to make it easy to verify what the function did?
Perhaps the initial testing that a developer does
could be completed very rapidly. Perhaps more elaborate,
integration-style testing could then be done in a real database
environment.
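The idea can be illustrated with a very small in-memory stand-in for the database that answers queries from seeded rows and logs every read and write. This is a sketch under assumptions: the class, its `select`/`insert` interface, and the `total_balance` service are hypothetical, written in Python for brevity rather than Java, and unrelated to how MasterCraft generates data access code.

```python
class MockDatabase:
    """In-memory stand-in for a database that logs every read and write."""

    def __init__(self, tables=None):
        # Copy the seed rows so tests cannot disturb each other's data.
        self._tables = {name: [dict(row) for row in rows]
                        for name, rows in (tables or {}).items()}
        self.log = []

    def select(self, table, predicate=lambda row: True):
        rows = [dict(row) for row in self._tables.get(table, [])
                if predicate(row)]
        self.log.append(("read", table, len(rows)))
        return rows

    def insert(self, table, row):
        self._tables.setdefault(table, []).append(dict(row))
        self.log.append(("write", table, 1))

def total_balance(db, customer_id):
    """Hypothetical service function under test."""
    rows = db.select("accounts", lambda r: r["customer_id"] == customer_id)
    return sum(r["balance"] for r in rows)

db = MockDatabase({"accounts": [
    {"customer_id": 1, "balance": 50},
    {"customer_id": 1, "balance": 25},
    {"customer_id": 2, "balance": 10},
]})

print(total_balance(db, 1))
print(db.log)
```

Because each test builds its own `MockDatabase`, there is nothing to restore afterwards and no conflict with other testers, and the log shows exactly which reads and writes the function performed.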
|
|