Make Room for the Monster Databases
By Joe McKendrick
They're big, and
they're getting bigger. New statistics out of Winter Corporation - an
analyst firm that tracks database size on an annual basis - find that
databases are poised to hit the 100-terabyte mark within the next two
years. Remarkable, considering that just a year ago, the largest databases
on record were between five and ten terabytes in size.

Some of the biggest databases are growing by a factor of 20,
Richard Winter, CEO of Winter Corporation, told DBTA. "They're projecting
growth in databases by 2004 that are in the range of 100 terabytes of
data," he said. Even more remarkable is the fact that Winter is only
measuring actual data stored in his annual survey - not system overhead. A
hundred terabytes of data "easily equates to more than 500 TBs of disk,"
he states.

Much of this growth is around
decision support functions, according to the latest Winter survey. "We
found that the average growth projection for decision support databases
over three years was 169 percent. For transaction processing systems, the
rate was 124 percent," Winter said. "On the average, decision support
databases were a half a terabyte larger than operational systems," he
added. The Winter survey also projects that this gap will grow to
more than two terabytes. "Decision support databases are growing
much faster than transaction processing databases - they're almost
tripling in size over the next two years."

At the core of this decision support, of course, is customer data.
CRM was the principal application among the very large databases measured
by Winter, followed by e-commerce. Interestingly, most of these
applications for very large databases were custom developed by the
end-user company. For example, the survey found that 41 percent of CRM
solutions were custom developed, versus 7 percent acquired through a DBMS
vendor, and 7 percent through another third-party application. Another 45
percent were not sure of the source of their CRM applications.

Of course, most companies don't have multi-terabyte behemoths on
site. In fact, most are still working on reaching their first terabyte. In
the mainstream, typical large databases range anywhere from 100 GBs to 500
GBs in size, Winter reported. Such database power is in evidence at
Foxwoods Resort Casino in Connecticut, the nation's largest casino
complex. The resort casino, owned and operated by the Mashantucket Pequot
Tribal Nation, relies on a 300 GB database for both marketing and casino
financial information. The system, which runs on a Progress database,
currently supports about 20 million transactions a day, according to Todd
Williams, MIS database and DSS manager for Foxwoods, in an exclusive
interview with DBTA.

The casino's player management
system is a "customer tracking and loyalty program [that allows] the business users to
track activity, better understand profitability, and better manage those
key customer relationships," said Williams. "Much of the core application
database consists of millions of patrons, and hundreds of millions of
player rating transactions. These ratings help track patrons' gaming
activity which provides insight into the Marketing Initiatives. Those
transactions also feed into a loyalty program that allows patrons to earn
comp points. So a certain amount of gaming play gets rewarded with
'points' that can be used at retail shops, restaurants, and for many other
casino offerings." The other half of the database supports many other
casino operations, including financials, Williams added. The Progress
database system runs on IBM pSeries hardware running AIX.

As found in the
Winter survey, the need for customer insights drives a number of large
database deployments. For example, GHS Data Management, a service bureau
to state agencies and insurance providers in Maine, relies on a 100 GB
database to monitor prescription benefits. GHS provides pharmacy benefit
management for the private sector and MaineCare, the state's public
healthcare assistance program. The data warehouse itself is supported on
an IBM pSeries processor running a Pick database on AIX. The front-end
data analysis portion, provided through Databeacon, runs on Microsoft SQL
Server on Windows 2000.

"For transactional processing,
we store everything in a normalized OLTP structure, for our baseline
format," says Jason Skeffington, project manager at GHS. "We then build
star schemas and cubes from the relational structure. We have analysts
here who do heavy-duty querying and aggregation that create standard
reports that we publish online to our clients." Data analyzed helps ensure
that clients are able to purchase prescriptions in the most efficient
manner possible. "We provide an online, real-time, claims adjudication
system," Skeffington told DBTA. "When a pharmacist enters a patient's
information into his PC to process a prescription, we're at the other end
of the pharmacy computer. Our system takes care of client info, drug
pricing, co-pays, prior authorization, and pharmacy billing."

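For readers who want to picture the structure Skeffington describes, here is a minimal sketch, not GHS's actual schema. It uses Python's built-in sqlite3 module rather than the Pick and SQL Server systems named above, and every table name, column name, and data value is hypothetical. It loads a few normalized claim rows, derives dimension and fact tables from them, and runs the kind of aggregate a standard report might need.

```python
import sqlite3

# Normalized OLTP baseline: one table per entity, linked by keys.
# All names and values here are invented for illustration.
OLTP_DDL = """
CREATE TABLE drug (drug_id INTEGER PRIMARY KEY, name TEXT, drug_class TEXT);
CREATE TABLE pharmacy (pharmacy_id INTEGER PRIMARY KEY, name TEXT, county TEXT);
CREATE TABLE claim (
    claim_id INTEGER PRIMARY KEY,
    drug_id INTEGER REFERENCES drug(drug_id),
    pharmacy_id INTEGER REFERENCES pharmacy(pharmacy_id),
    fill_date TEXT,       -- ISO date string
    amount_paid REAL
);
"""

# Star schema derived from the baseline: dimension tables around one fact table
# that is pre-aggregated by month, drug, and pharmacy.
STAR_DDL = """
CREATE TABLE dim_drug AS SELECT drug_id, name, drug_class FROM drug;
CREATE TABLE dim_pharmacy AS SELECT pharmacy_id, name, county FROM pharmacy;
CREATE TABLE dim_month AS
    SELECT DISTINCT substr(fill_date, 1, 7) AS month_key FROM claim;
CREATE TABLE fact_claim AS
    SELECT substr(fill_date, 1, 7) AS month_key,
           drug_id,
           pharmacy_id,
           COUNT(*) AS claim_count,
           SUM(amount_paid) AS total_paid
    FROM claim
    GROUP BY substr(fill_date, 1, 7), drug_id, pharmacy_id;
"""


def build_demo():
    """Create the baseline tables, load sample rows, and derive the star schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(OLTP_DDL)
    conn.executemany("INSERT INTO drug VALUES (?, ?, ?)", [
        (1, "Drug A", "statin"),
        (2, "Drug B", "statin"),
        (3, "Drug C", "antibiotic"),
    ])
    conn.executemany("INSERT INTO pharmacy VALUES (?, ?, ?)", [
        (1, "Pharmacy One", "County X"),
        (2, "Pharmacy Two", "County Y"),
    ])
    conn.executemany("INSERT INTO claim VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1, "2002-10-03", 40.0),
        (2, 2, 1, "2002-10-17", 55.0),
        (3, 3, 2, "2002-10-20", 12.5),
        (4, 1, 2, "2002-11-02", 40.0),
    ])
    conn.executescript(STAR_DDL)
    return conn


def spend_by_class_and_month(conn):
    """Report total amount paid per month and drug class from the star schema."""
    return conn.execute("""
        SELECT f.month_key, d.drug_class, SUM(f.total_paid)
        FROM fact_claim AS f
        JOIN dim_drug AS d ON d.drug_id = f.drug_id
        GROUP BY f.month_key, d.drug_class
        ORDER BY f.month_key, d.drug_class
    """).fetchall()


if __name__ == "__main__":
    for row in spend_by_class_and_month(build_demo()):
        print(row)
```

The report query touches only the small fact and dimension tables, which is the usual reason for building them instead of aggregating the transactional tables directly.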
So far, there have
been few performance issues with the large amounts of data flowing through
the system, which consists of an eight-way processor with a half-terabyte
ultra fast SCSI back-end, according to Skeffington. "Up until a year ago,
we had a smaller dual-processor server with a quarter-terabyte back end.
That was doing okay, but we could see down the road that if we wanted to
do heavier reporting and launch an online decision support system, we
needed to ramp up."

At GHS, the benefits were most
immediately apparent in terms of IT staff productivity. "Customers hardly
talk to me anymore; they just go online," Skeffington said. That's fewer
queries he has to make, fewer reports he has to write, and, therefore,
less money he has to spend, he pointed out. Maine state officials are also
able to analyze spending trends online, and thus make appropriate
adjustments that could save taxpayer dollars.

The benefits gained from large database deployments can be felt
across sponsoring organizations. In the Winter survey, 26 percent of
respondents cited increased performance and the ability to meet workload,
growth, and scalability demands. Another 14 percent cite the single view of
enterprise data that such a database provides, while 13 percent credit
their deployment with making IT a competitive and business asset.

Ace Hardware, a chain of
independently owned hardware stores, has been tracking the metrics of its
own large customer data warehouse implementation on an NCR Teradata
system. The company reports that its analysis to create promotional
campaigns "takes hours instead of days," and found that its positive response
rate (decision to buy) doubled from 5 percent to 10 percent. The data
warehouse also made the company's first-ever national sales campaign
possible, said Diane Flynn, data warehouse manager for Ace. "Customer
service and neighborhood convenience differentiate Ace independent
retailers from our 'big-box' competitors," she explained. "And to continue
to build on those factors, we knew we needed a data warehouse that would
enable us to store and analyze transaction data for insight into customer
preferences and behaviors."

While there are many technical
challenges to address in deploying large databases, the most daunting
obstacles are political ones, the Winter survey found. Almost a third of
the group say organizational issues hampered the progress of their
projects. "The most widespread obstacle was getting the buy-in from all
levels of the organizations," said Kathy Auerbach, vice president of
Winter Corporation. "You have to have the users buying in to saying, 'yes,
these are the requirements we need, and when you give us this system,
we're going to use it, and we're not going to hold back and do things the
old way.'"

For instance, political issues
played an inhibiting role at one Midwestern retailer that had grown its
database to 30 TB, said Debbie Smith, database analyst with NCR Teradata
and former database administrator at the site. While the system grew in
leaps and bounds, it was sometimes a tough sell to management to justify
ongoing improvements. "One year, we tripled our size of data, filled it
up, and the next year we had to double it again," she related. "We were
good at capacity planning, but we just didn't communicate it effectively."
The key to such communication is documenting ROI as the database grows,
Smith explained. "By not having ROI, it is difficult to present the
business case to the business users. Technical details about costs of
upgrades to support number of queries, it doesn't mean anything to them.
All they should care about is if they're getting their information in a
timely manner." This is new territory for many companies, she added. As
databases continue to grow beyond their current boundaries, the largest
deployers will continue to blaze new trails in terms of garnering
organizational support, she said.
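
The growth figures quoted in this article lend themselves to simple arithmetic when making a capacity case. The Python sketch below assumes that "169 percent growth over three years" means a database ends up at 2.69 times its starting size and that growth compounds evenly each year; the survey as reported here does not spell out either point. The 5 TB starting size in the example is invented purely for illustration.

```python
def annual_rate(total_growth_pct, years):
    """Compound annual growth rate implied by a total percentage increase.

    Assumes the total is an increase over the starting size, so 169 percent
    means the final size is 2.69 times the original.
    """
    multiple = 1 + total_growth_pct / 100.0
    return multiple ** (1.0 / years) - 1


def project_size(start_tb, yearly_multiples):
    """Return the size at the start and after each year's growth multiple."""
    sizes = [start_tb]
    for multiple in yearly_multiples:
        sizes.append(sizes[-1] * multiple)
    return sizes


if __name__ == "__main__":
    # Survey figures: decision support vs. transaction processing, three years.
    for label, pct in (("decision support", 169), ("transaction processing", 124)):
        print(f"{label}: about {annual_rate(pct, 3):.1%} per year")

    # A site that triples one year and doubles the next (hypothetical 5 TB start).
    print(project_size(5, [3, 2]))
```

Expressing growth as a yearly rate or a year-by-year size table is one way to put a capacity plan in front of business users without the technical detail Smith warns against.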

University of the District of Columbia Opts for OpenInsight
By Kara Kridler
It's called
seeding the market. Software companies work hard to get their products
into the hands of college students for one simple reason: The tools they
learn in school today will be the tools used in the workplace
tomorrow.

Revelation Software has now
joined the ranks of database vendors whose products are taught at the
university. Next semester, computer programming students at the University
of the District of Columbia (UDC) will learn the basics of database
programming using the company's flagship OpenInsight applications
development environment.

Carl Friedman, an associate
professor in the Computer and Information Systems department at UDC,
decided to use OpenInsight. He was first introduced to Revelation in the
1980s. At that time, he was doing a great deal of consulting for city
government agencies in Washington D.C. His clients were looking for a
database program to analyze data. Friedman attended a presentation where
the company was offering its Advanced Revelation multivalue database,
which, at the time, ran under DOS. He tried the free tutorial and liked
what he saw.

"It was a fantastic package,"
Friedman said. "I toured through the whole tutorial in a weekend. It just
beat everything on the market. There was nothing close to it. Even today,
it cannot be touched. I was in love with it," he said.

Friedman has stuck with the product through a series of
corporate twists and turns over the past 15 years. Mike Ruane, whom
Friedman described as a "super programmer," now heads Revelation Software. The
company unveiled its latest version, OpenInsight 4.1, this fall.
OpenInsight is a complete application development environment with a
multivalue database as its centerpiece.

With
the new release, Revelation has stripped the database of much of its
DOS heritage, and the support for some of the features he uses has been
diminished, Friedman said. Nevertheless, he said, the overall package is a
"developer's dream" and is still the best database program available. He
has been working with Ruane and Revelation to develop strategies that will
help OpenInsight find a wider audience. People who are unfamiliar with
multivalue database concepts have to go through a learning curve, he
noted.

As part of his commitment to
broadening the user community for OpenInsight, Friedman has decided to
offer a class to his students. The older version of Revelation could not
reasonably be taught in a world dominated by Windows-based programs, he
noted. The earliest Windows-compatible versions were also not appropriate.
But with the latest version of OpenInsight, Friedman felt that it was time to
take the plunge and introduce multivalue database programming to his
students.

Friedman will use the
program in a class called Database Programming. The majority of Friedman's
students are seniors, who have already studied several different
programming languages. Students enrolled in the class must be familiar
with the commonly used databases, such as Oracle, IBM DB2, MS SQL Server
and MS Access. "I consider OpenInsight the best multivalue database on the
market today and that is why I had no hesitation teaching it. The tools
that come with it also work just handily on Oracle and Lotus Notes,"
Friedman said.

Even though Friedman is focused
on the multivalue aspect of the program, he is enthusiastic about the
other skills his students will obtain. "Everything they [his students] are
learning, except the actual multivalue aspects themselves, can be
transferred over to all of the other major database systems," he said.
Friedman added that the most important aspect of using OpenInsight in his
class is that when his students are finished with his class, they will
have marketable skills regardless of what they encounter.

The
intention is for OpenInsight to help his students develop tools that the
other programs do not supply. "You can run the tutorial and after running
the tutorial you can put together an application," Friedman said.
OpenInsight should be a good match for UDC students since they tend to
remain in the DC area after graduation and there is an existing demand in
Washington for these programming skills. Friedman hopes his students will
have a competitive edge for those jobs. His students will have an
advantage over employees who do not have OpenInsight experience. Without
that experience, employees have to be sent away for training, which is
time-consuming and costly to the employer.

Friedman said that the most
important factor in his decision to introduce OpenInsight in the classroom
was that he strongly dislikes inefficiency. He wants to see more people
doing things the fastest way. "If I do my job well, I expect my students
to be in a position, within a shorter time than say other schools because
I have older students, to have some influence in what software selections
are made and OpenInsight may then become part of the decision mix or
the solution mix in their organization," Friedman said.
Kara
Kridler is a freelance writer living in Washington D.C.
Tellabs Copes with Database Proliferation
By Walt Jordan
Tellabs is a major
provider of telecommunications infrastructure equipment. It manufactures
cross connects and switches for a wide range of customers, including
AT&T and the major Internet service providers.

Over the past several years, the data management
challenges at Tellabs, which has more than 5,000 employees, have increased
dramatically. In fact, when Praveen Gautam joined Tellabs in March 2000,
first as a database administrator, and then as the manager for global
information services, the company had five databases. Gautam and three
associates could manage those databases using in-house tools. Today,
Gautam and his group, which consists of three teams, manage more than 100
databases. DBTA editor Walt Jordan talked with Gautam to understand how he
does it.
Jordan: Who is
in your group and to whom do you report? Gautam: I have three
teams: three SAP ERP administrators, four database administrators, and two
batch-and-print administrators working with Tivoli. I report to the
director of North American operations.
Jordan: What
was the situation when you started at Tellabs? Gautam: I was
hired as a DBA. At the time, Tellabs was starting to work with Oracle. We
had SAP running on Informix, which was our biggest database with six to
seven terabytes of data. Our technical architecture group decided to
standardize on Oracle for all our new implementations. And in one year, the
number of databases grew. Now there are close to 100 Oracle
databases.
Jordan: What
led to that proliferation? Gautam: At the time, the economy was
doing well, and Tellabs was growing at a rapid rate. We were implementing
a portal using Broadvision. We were implementing Clarify and Documentum.
All those applications were coming into production. And there were a lot
of in-house initiatives going on. For those, Oracle was chosen to be the
back-end database.
Jordan: Who
managed the databases? Gautam: There were two DBAs for the
Oracle databases and two for the SAP implementation on Informix.
Jordan: Your
responsibilities were? Gautam: We had to support our production
environment. We had to support our application teams for the new
databases.
Jordan: How did
you do that? Gautam: When I joined Tellabs, there were no
standards and no tools to manage database administration. I was doing
things at the command line. But when the work began growing so rapidly, we
didn't have the luxury of working at the command line and making mistakes
and losing time.
Jordan: So what was
your strategy? Gautam: I was experienced with DBArtisan from
Embarcadero Technologies. I liked it because it allows you to do a lot of
the things that you do at the command line, but there is little chance of
making mistakes. So I thought that would be a good approach.
Jordan: How did
you make the decision to add the tool? Gautam: After six months,
I was promoted to be the team leader, and I was asked to cross train the
others in the group to become Oracle DBAs, because the number of Oracle
databases was growing.
Jordan: What
was the training process? Gautam: I sent them to formal Oracle
training, but they were still worried that they didn't have enough
experience to jump into a production environment. They knew what to do,
but they didn't know exactly how to do it. So when they saw DBArtisan, it
was a great help to them. DBArtisan gave them a good way to navigate
through the print schemas and objects in Oracle. Before they would execute
a statement, they could see the SQL running behind it. That was a big
asset. They were able to get up to par quickly. Within two months, I could
put them on the production support rotation.
Jordan: So you
realized two benefits. Your team made fewer mistakes, and you were able to
move people into a production environment more quickly. Gautam:
Right.
Jordan: Did you
have to sell the idea of buying database administration tools
internally? Gautam: When I saw the number of databases growing
rapidly and a lot of new applications on the horizon, I talked to my
manager and told him how difficult it was for DBAs to work at the command
line and the risk of mistakes and losing time and losing important data. I
told him that we could not work that way in a production
environment.
Jordan: Was
cost an issue? Gautam: The cost is peanuts compared to what we
paid for other tools. I did a demo showing the benefits. Then I did a
justification document showing the benefits.
Jordan: Have
you looked at any other tools? Gautam: We have BMC for our
monitoring infrastructure, and we have installed some of those tools. And
we have Oracle tools through our enterprise license. I use Oracle for the
export and import of information. But I have found DBArtisan to be the
most flexible tool.
Jordan: Is
there anything else on your shopping list? Gautam: At this
point, I don't see the need for any other tools.
Jordan: When
you first came in, the number of databases was exploding. Now we are in a
period of consolidation. How have you managed that? Gautam: Even
though the scope of several projects was reduced, the databases still have
to be managed. I haven't had to delete a single database or take any off
support because an application has been canceled. Although the number of
databases has stopped growing, I still have 100 databases to manage.
Jordan: So what
is the day-to-day situation? Gautam: A lot of development work
is still going on. We get a lot of requests on a daily basis to make
schema changes and to import and export refreshes. The 100 databases are
on 50 servers, and it is impossible to log onto each box. DBArtisan has
given us a central location to manage that.
Jordan: How do
you manage the infrastructure? Gautam: We have a database to
manage the other databases. We have scripts that feed information into our
central database, which is the repository of all our jobs information. We
query that every day and have a GUI interface. To edit that, we use the
edit function in DBArtisan. Before, I had to write SQL
statements.
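
The "database to manage the other databases" that Gautam describes can be pictured with a short sketch. The one below is an illustration only, not Tellabs' repository: it uses Python's built-in sqlite3 module in place of Oracle, and the table layout, column names, status values, and sample jobs are all assumptions. Collection scripts would call `record_job_run` for each job, and a daily check would call `failed_jobs`.

```python
import sqlite3


def open_repository():
    """Create an in-memory central repository for job information (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE job_run (
            db_name  TEXT,   -- database the job ran against
            server   TEXT,   -- host the database lives on
            job_name TEXT,
            run_date TEXT,   -- ISO date string
            status   TEXT    -- 'OK' or anything else for a problem
        )
    """)
    return conn


def record_job_run(conn, db_name, server, job_name, run_date, status):
    """Called by a collection script to feed one job result into the repository."""
    conn.execute(
        "INSERT INTO job_run VALUES (?, ?, ?, ?, ?)",
        (db_name, server, job_name, run_date, status),
    )


def failed_jobs(conn, run_date):
    """Daily query: every job that did not finish cleanly on the given date."""
    return conn.execute(
        """
        SELECT server, db_name, job_name
        FROM job_run
        WHERE run_date = ? AND status <> 'OK'
        ORDER BY server, db_name, job_name
        """,
        (run_date,),
    ).fetchall()


if __name__ == "__main__":
    repo = open_repository()
    record_job_run(repo, "crm_prod", "srv01", "nightly_export", "2002-12-02", "OK")
    record_job_run(repo, "docs_prod", "srv02", "nightly_export", "2002-12-02", "FAILED")
    print(failed_jobs(repo, "2002-12-02"))
```

Keeping one row per job run in a single place is what makes a daily query across 100 databases practical without logging onto each server.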
Jordan: How
many more databases could you manage before you would have to add to the
head count? Gautam: I think that we could grow about 20 percent
before we have to add more people.
Jordan: Do you
plan to add any new database platforms? Gautam: We do have two
Microsoft SQL Server databases and a Sybase database. We are developing an
enterprise license with Microsoft, so I see the number of SQL Server
databases picking up. But we can manage that with DBArtisan as
well.
Jordan: What kind
of challenge does adding different database platforms present? Gautam:
We will have to cross-train people about how SQL Server does things
differently than Oracle. But the basic database administration concepts
are the same. I think I will be able to do it with internal cross
training.
Walt Jordan is a
regular contributor to DBTA; write him at Walt@dbta.com.
DB2 Version 8.1 Goes to General Release
By Billy Rosario
IBM announced that DB2
Version 8.1 went into general release in late November. The release
represents an important milestone in the technology roadmap IBM has laid
out for its data management infrastructure solutions and autonomic
computing initiatives. The new database software is designed to help
companies simplify and automate many of the tasks associated with
maintaining databases, as well as delivering the broadest support for open
standards, enabling customers to manage, integrate, and analyze
information from the widest variety of sources to gain a greater return on
their investment.

To understand what the general
release of DB2 Version 8.1 means to IBM and understand the company's
vision for the future, DBTA contributing editor Billy Rosario talked to
Dr. Patricia Selinger, IBM Fellow and vice president of data management
architecture and technology. A pioneer in the field of relational
databases, Selinger directed IBM's Database Technology Institute for
12 years. In 1999, she was elected to the National Academy of Engineering for
contributions to the field.
DBTA: What are
you most excited about in this DB2 release? Selinger: There are
a number of things that are important to our customers. We have a number of
technology pieces in Version 8.1 that really help cut down the amount of
time a database administrator has to spend maintaining and tuning the
database.
DBTA: Why is
that so important? Selinger: As databases grow in size and the
need to connect data together grows to keep companies competitive, a DBA's
job keeps growing. Anything we can do to help with the total cost of
ownership by reducing the amount of time that a DBA takes to do certain
tasks or eliminate them altogether, that is a wonderful thing. Look at
something like the Configuration Advisor in Version 8.1. This is a set of
questions that a DBA spends about 20 minutes answering. We then come out
with recommendations based on expertise built in by experts at our
development labs and our performance team for the configuration
parameters. We have tested this against the tuning we did for an OLTP
benchmark and we came within 91 percent of the experts. For 20 minutes of
work compared to weeks or months of work by the experts, this is a
substantial savings.
DBTA: But
still, is it good enough? Selinger: For many of our customers,
it is very close for the kind of workloads that they run. For the
customers who are on the leading edge and want that last ounce of
performance, this is a very good starting point. They can do hand tuning
from this point on. Some of our real customers got better performance than
from the experts.
DBTA: There is
some other new technology aimed at BI applications too. Selinger:
We have added multidimensional clustering at the physical storage
level, which nobody else has. We have applied for a patent for this
technology. It concerns the ability to store data and cluster it in many
dimensions at the same time. For any user request, you can go straight to
the data that qualifies for your query. It is an advantage for people who
build warehouses. With most warehouses, you are not absolutely certain how
people are going to query it. You get surprises. But rather than having to
reorganize your data, you can organize it across multiple
dimensions.
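
The general idea Selinger describes, keeping rows that share the same values in several dimensions physically together so a query can skip everything else, can be illustrated with a toy model. The Python sketch below is a conceptual illustration only and makes no claim about how DB2 implements the feature; the class name, method names, and sample data are all hypothetical.

```python
from collections import defaultdict


class CellClusteredTable:
    """Toy table that stores rows grouped by the combination of dimension values."""

    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        # One "cell" per distinct combination of dimension values;
        # rows in the same cell are kept together.
        self.cells = defaultdict(list)

    def insert(self, row):
        key = tuple(row[dim] for dim in self.dimensions)
        self.cells[key].append(row)

    def query(self, **wanted):
        """Return (matching rows, number of cells read) for any subset of dimensions."""
        unknown = set(wanted) - set(self.dimensions)
        if unknown:
            raise KeyError(f"not a clustering dimension: {sorted(unknown)}")
        hits, cells_read = [], 0
        # Only the cell keys are inspected here; rows in non-matching cells
        # are never touched.
        for key, rows in self.cells.items():
            if all(value == wanted[dim]
                   for dim, value in zip(self.dimensions, key)
                   if dim in wanted):
                cells_read += 1
                hits.extend(rows)
        return hits, cells_read


if __name__ == "__main__":
    table = CellClusteredTable(["region", "month"])
    table.insert({"region": "east", "month": "2002-10", "sales": 10})
    table.insert({"region": "east", "month": "2002-11", "sales": 20})
    table.insert({"region": "west", "month": "2002-10", "sales": 5})
    rows, cells = table.query(month="2002-10")
    print(len(rows), "rows from", cells, "cells")
```

Because the same layout serves a query on region, on month, or on both, an unexpected query pattern does not force the data to be reorganized around a different single sort order.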