Article
04/05/2007

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

ERwin 4.0 Reviewed

Paul Munkenbeck

Like many of us, Paul Munkenbeck is an ardent fan and long-time user of ERwin. So it's particularly disappointing to read his review of what seems to be a bug-ridden, premature release.

ERwin was originally developed by Princeton, New Jersey-based Logic Works as a data modeling tool for the then new breed of client/server databases. [Benjamin C. Cohen founded the company in 1988—reportedly using his credit cards—and, as the original programmer of the company's flagship ERwin product, helped redefine CASE (computer-aided software engineering) tools.—Ed.] ERwin had to evolve fast to keep up with strong competition from products such as Popkin's System Architect, SDP Technologies' S-Designor (subsequently Powersoft's and now Sybase's PowerDesigner), and Embarcadero's ER/1 (now called ER/Studio)—not to speak of new features in the leading RDBMSs themselves, but by the late 1990s, it was widely viewed as the market leader.

In 1998, Logic Works was acquired by Chicago-based tools vendor Platinum Technologies—a respected firm with a solid line of data-oriented products. Although Platinum promised continued research and development, that wasn't to happen because Platinum in turn became takeover bait. When Computer Associates acquired Platinum in 1999 for a record-setting $3.5 billion, ERwin was just one of many products CA inherited, and clearly not a strategic one. In fact, until the March release of ERwin 4.0, CA had been satisfied with releasing about one service pack per year (to keep those with annual maintenance agreements happy?). It will be interesting to see how much CA invests in ERwin support and marketing.

Frame of reference

In my firm, I want to use ERwin to create and maintain small to medium (in terms of numbers of entities) tactical databases on SQL Server 7 and 2000. Working for many different clients, I have no need for strategic top-down corporate modeling, so please forgive my lack of experience with the features of ERwin that support enterprise modeling and development. [ERwin 4.0 ($3,995 U.S.) is, in fact, also available as part of CA's ERwin 4.0 Modeling Suite ($6,995 U.S.), which integrates ERwin (http://ca.com/products/alm/erwin.htm) for data modeling, BPwin (http://ca.com/products/alm/bpwin.htm) for business process modeling, Paradigm Plus (http://ca.com/products/alm/paradigm_plus.htm) for component modeling, and ERwin Examiner (http://ca.com/products/alm/erwin_examiner.htm, $1,995 U.S.) for ensuring the consistency and quality of data models.—Ed.]

My approach to database evolution tends to be pragmatic: After generating an initial database design from business requirements, I then hand over to my developers and allow them to make minor structure changes directly on the development database. When a release is to go into production, I synchronize the ERwin model with the development database, compare it to the production database, and generate a change script that will apply the change as required.

To support this approach, the main features of ERwin that are important to me are 1) data modeling support for standard information engineering techniques, 2) database generation (forward engineering) that exploits the native features of SQL Server, and 3) a synchronization process (reverse engineering/compare) that recognizes and handles all of the features that might be present in a SQL Server database.

Installation and new features

ERwin 4.0 runs on any current 32-bit version of Windows and requires 64MB of RAM. I run it quite happily on a PII200. Various CA documents disagree on the amount of disk space required, varying from 42MB to 85MB, but I found it needed about 58MB. You do need to restart Windows at the end of the install. Version 4.0 can be installed alongside the previous version (ERwin 3.5.2), and both versions can be run at the same time. CA has already released the first "fix" pack, which you can download, along with a tutorial (a new 5.8MB HLP file that contains 10 lessons—don't ignore the "Methods" tutorial left over from Logic Works' days), from http://support.ca.com/erwin_supp.html. At 30MB the download is anything but trivial, but you have no choice, as CA told me it's not available on CD. You need the patch if you want XML support and to fix a number of GPFs and lesser bugs, so I applied it immediately. [The Readme file that lists the fixes also indicates that support for both VB and PowerBuilder was "stabilized at ERwin 3.5.2 and will not be provided in ERwin 4.0," that documentation for the ERwin 4.0 API won't be available until SP1, and that the ERX output has been replaced with XML. The API was available as of mid-June, however, in build 1338.—Ed.]

ERwin 4.0 includes a number of new features, and you can obtain full details in various PDF documents from the Computer Associates Web site. I've picked out some of the features that are important to data modelers and SQL Server DBAs and have looked at these in detail. The "Getting Started" contains some basic tutorials and useful tips, and ERwin also ships with a sample model called EMovies that's based on a movie rental business.

As far as I'm concerned, the best new feature is the Model Explorer, which gives you a tree-like view of all of the objects in the model (see Figure 1). If you wish, you can now edit and change model objects directly in this new interface rather than searching around for them in a large diagram.

To try out the new interface, I created a new model for a Web registration application. I first used the Domain Dictionary to create the domains (predefined combinations of datatype, default value, and validation rules) that I use for contact information such as Name, Address Line, E-mail, Telephone, and so on. I was impressed that even in the logical domain dictionary, ERwin 4.0 supports Unicode datatypes. However, even if it hadn't, I could have used the new Datatypes Standards Editor to create my own datatypes to be mapped to ERwin domains. I happen to use the IE (Information Engineering) methodology rather than ERwin's default IDEF1X notation, but it's easy to switch among methodologies. Once I'd configured ERwin for IE notation, I created new entities and set up relationships between them.

Design layers and model derivation
ERwin 3.5.2 supported a combined logical/physical model and allowed you to switch views between them. ERwin 4.0 now allows you to create independent logical and physical models, derive one or more physical models from a logical model, and link a model to its source so that changes can be synchronized and history can be traced. The idea behind this is to help you build your design in layers—a great concept. For example, I'm an advocate of a phased design approach: First produce a completely normalized logical data model, then convert to a first-cut physical model, next perform a second cut based on access analysis with indexes and denormalization, and finally, perform a third cut defining physical storage. The new ERwin design layer approach promised to help this significantly. Unfortunately, the reality was disappointing.

One of the features of this new approach is the ability to set rules for generating the names of your physical objects from their logical names. For example, you can add a prefix and suffix to each entity name when converting to a Table, or generate an index name from a logical key group. You can override the rules and set explicit names if you prefer. I took the sample EMovies logical model and specified a prefix of tbl and a suffix of s (for plurals) to be added to each entity name, and it worked fine when switching views in a combined logical/ physical model. However, when I used the Derive Model function to create a new SQL Server 2000 physical model, it resulted in some tables getting the prefix and suffix added twice, so STORE became tbltblSTOREss!

To test synchronization between a source model and a derived model, I created an updateable copy of the read-only sample EMovies model and derived a new logical model from it. I selected only a subset of the original containing just the subject areas for Customer and Movies. I changed the derived model by adding a new entity Customer Report as a child of Movie Rental Record and a new attribute Manager Discount to entity Movie Rental Record. I then attempted to synchronize this change with the original source. One annoying feature of the synchronization dialog is that long filenames aren't supported—the filename for my source model appeared as: C:\PROGRA~1\COMPUT~1\ERWIN4~1.0\SAMPLES\STANDARD\COPYOF~1.ER1.

More significantly, the synchronization dialog reported approximately 25 differences between the source model and the derived model! Many of these were changes I hadn't even made—the setting of the Key Groups Logical Only flag, unmatched UDP Definitions (they looked identical to me), and the order of attributes. Hidden among these bewildering and spurious differences were my new attribute and entity, so I selected these to be exported back to the source model. It appeared to work okay, as the changes were present in the source model when I re-opened it, but the Manager Discount attribute was in a different position in its entity.

What with the awkward interface and spurious differences reported, I'm not really comfortable enough with this to risk it in a real-world exercise.

Model history and traceability
According to the ERwin 4.0 documentation, "You can save historical information for your model, entities, attributes, Tables, and columns. You can track meaningful changes to derived and transformed models, as well as routine information about the creation and revision dates of your models using the History feature."

This just doesn't work as I expected. I set the properties of my models to record every possible change just before carrying out the preceding model derivation exercise. The only change recorded in history was the fact that I created a new entity in the derived model—there was no record of the new attribute, no record of synchronization back to the source model, no record of what the synchronization process did to the source model.

Transforms
There are nine new methods, known as transforms, that you can apply to a model. Transforms help you apply a design decision to your model and maintain a record of that decision, and there are both simple and complex transforms. Simple transforms include link column copy, supertype-subtype identity, and many-to-many association Tables. Complex transforms include Two Table Roll Up, Two Table Roll Down, Supertype-Subtype Roll Up, Supertype-Subtype Roll Down, and Vertical and Horizontal Partitioning. One of the big benefits of transforms is that you can add a description to the transform—explaining why you made the decision. This is invaluable for later support and maintenance of the model. For documentation purposes, you can produce a report containing details of all of the transforms and your explanations.

To try out the transform feature, I added a pair of tables to the sample EMovies model—implementing a simple parent/child relationship. I then applied the Roll-Down denormalization to combine all of the columns into the child. It all worked fine, and it's even possible to reverse the process if you change your mind. During the transform, I requested the option to synchronize changes with the source objects, which you can still see in the Transform tree in Model Explorer. However, it wasn't too clever at handling a new attribute called ChildUpdated that I added to the transformed table—it appeared with a name of _default_ in Model Explorer.

Another criticism I have with this is that in order to carry out a denormalization transform on a pair of tables, ERwin requires that the relationship be an identifying one—that is, where the foreign key is part of the primary key of the child. To me, this seems overly restrictive. After all, there are instances when I already have a unique primary key and simply don't want the foreign key added to it or its dependents. Yes, it is possible to work around this by 1) temporarily making the relationship an identifying one, 2) applying the transform with the option not to reflect future changes in the source model, and then 3) altering the foreign key afterwards to turn off the primary key property, but this is hardly intuitive!

Even though transforms don't allow you to do anything you couldn't do manually before, once some of the quirks are resolved, I feel they'll be a useful addition to the toolkit.

Naming Standards Editor

The Naming Standards Editor is a new tool for defining naming standards for entities, attributes, domains, tables, and columns. You can also create a glossary of words that can be used by the whole enterprise for naming the aforementioned objects and validate a particular model against your glossary.

ERwin considers an object name to consist of up to four parts: prime, class, and two modifiers. For example, in the attribute named Average Sales Total Year To Date, Sales is a prime, Total is a class, and Average and Year To Date are modifiers. You might want to define a standard so that the equivalent physical column name derived from this is, say, AVGSLSTOTYTD. Within the glossary, you can set up a list of words that can be used as primes, classes, and modifiers and provide two abbreviations that can be associated with each. Some words may be used as more than one component; for instance, Sales may be a prime or a class depending on context. You can either enter your glossary directly into the Naming Standards Editor or you can prepare it as a set of comma-separated values. A sample CSV is provided, and I successfully imported it into my glossary and saved my naming standards as an NSM file.

I then associated this file with my new logical model and initiated the dialog that validates the model against the naming standards. This is not a friendly interface, and every time you restart it, you go back to the initial defaults. If a word doesn't match the glossary entries, you can ignore it once—or forever—but you can't simply add it to the glossary from here. In fact, if you alter the glossary at all, you have to reload it into the model. You can choose whether to check entities, attributes, and/or domains, and you can choose whether to enforce positional standards. This latter option is very severe, as it insists that you use all of the name components and in the order defined. For my entity-naming standard, I'd defined prime followed by class, as some, but not all, of my entity names have classes. The compliance check does not allow optionality and rejected all of the entity names that had a class. To add insult to injury, the dialog reports non-compliances one item at a time—something that's not only impractical for a large model, but also extremely aggravating.

Overall, the Naming Standards Editor is a reasonable concept for a company that wants to set enterprise standards. Unfortunately, this tool hasn't been well thought out and is so awkward to use that many will be put off completely.

Forward Engineering, Reverse Engineering, and Complete Compare
Forward Engineering is the process by which a physical database is generated from an ERwin model by generating and/or applying a script. Reverse Engineering is the opposite process whereby ERwin examines a database or script file and generates an ERwin model from it. Complete Compare synchronizes an existing model with a database or script file. There are some new wizards that guide you through the process of importing model information into ERwin and exporting changes to a database or script file. The wizards in 4.0 step you through the processes more carefully than in earlier releases, and there's also a handy new filter that enables you to ignore annoying physical objects such as the dt_ stored procedures that get created by Visual Studio.

One minor but irritating complaint about these tools is that I can't get them to consistently remember changes I make to the default settings—they keep returning to their initial values even if I save the settings. (If you're comparing against a database rather than another model, you certainly want to turn off some of the comparisons to prevent objects from being reported as missing in the database—such as domains.)

I also have some very severe criticisms about the Reverse Engineering and Complete Compare tools and their support for SQL Server 7 and 2000:

ERwin has never supported reverse engineering of SQL Server security—for instance, users, roles, GRANTs, and REVOKEs. In my view, this is a serious omission and nearly a show-stopper. During forward engineering, you can achieve some functionality by using a Table Level Script macro; for example, GRANT SELECT ON [%TableOwner].[%TableName] TO username. It's easy to attach this script to all existing tables as well as automatically assigning it to any new tables. However, this isn't really appropriate in anything other than trivial security regimes. If a change script requires a drop and re-create of a table, view, or stored procedure, for example, then you have to manually determine the security that existed beforehand and put it back yourself afterwards.
ERwin 4.0 always reports that for every stored procedure, the code in the model is different from that in the database. I reported this to CA (it's extremely easy to reproduce), and it's been accepted as a bug. But how much testing has this product had if a core feature such as this has such a significant and obvious fault?
ERwin doesn't provide native support for the new declarative referential integrity (DRI) functions in SQL Server 2000. You can certainly set DRI properties such as cascade delete in an ERwin model, but these do not produce SQL Server 2000 DRI syntax. Again, you can write ERwin macros to generate the code from relationship properties, but ERwin doesn't necessarily generate the tables in the correct sequence to support this code. If you reverse engineer a SQL Server 2000 database with DRI already defined, ERwin fails to recognize it properly and sets the wrong options in the model. I've raised this with CA as an enhancement request, but it's disappointing that ERwin doesn't support such an important feature in the current version—indeed, one that's been shipping for more than a year and a half—of a major RDBMS.
Even more disappointing is lack of recognition for two features that were introduced back in SQL Server 7, namely computed columns and autostatistics. During reverse engineering, ERwin interprets a computed column as a normal column and then gets it completely wrong in any change scripts. And autostatistics are not real indexes (even though they appear in sysindexes with names like _WA_Sys), but ERwin interprets them as if they were and attempts to drop and re-create them as part of change scripts. This was a fault in ERwin 3.5.2 that I reported to CA more than 18 months ago and it's still present in version 4.0. All it needs to do is check for each index that INDEXPROPERTY(id,name,'IsAutoStatistics') isn't 1. Both of these faults have now been raised with CA as enhancement requests.

Report Builder
Reporting has undergone a complete revamp, and the new Report Builder functionality is a great improvement. You define templates for your reports, selecting from just about anything contained in your model, even the diagram graphic itself (at last!). CA even supplies some sample templates to get you started. You can then apply any template to the current model, and generate output in rich text, comma-separated, or HTML format. Running a report is amazingly fast and automatically opens the application assigned to the format chosen. The HTML option is particularly impressive, as the Table of contents appears in a frame with links to each section of the report as you can see in Figure 2. The only minor complaint I have is that CA put its company logo in the corner of the contents frame—still, I suppose you could locate the graphic file they use and replace it with your own.

Compatibility

ERwin 4.0 can read and save to ERwin 3.5.2 models, with the usual limitations that new ERwin 4.0 features can't be saved to the earlier release. However, Model Mart support won't be available until ModelMart 4.0, and, as mentioned earlier, the ERwin 4.0 API is still in beta. ERwin 4.0 has also lost some features that were present in earlier releases. In addition to dropping support for ERX and VB clients, for example, it also no longer supports the extremely useful ERwin Data Dictionary, which provided for storage of versioned models in a relational database rather than the file system. [CA has also opted to completely kill TestBYTES, the handy test data generator initially developed by Logic Works.—Ed.]

On a more positive note, CA has developed some exciting functionality to help diagnose and validate the quality of your design. However, rather than include this in ERwin 4.0 (where it should be), they've decided to sell it as a separate product, ERwin Examiner. If you do decide to purchase this, make sure you apply patch 1 to ERwin 4.0 to ensure compatibility.

Support

If you need technical support, you can visit the Computer Associates Technical Support Web site at http://esupport.ca.com. I've found this much more helpful than the British telephone support line. Initial responses are usually within a day, and a full technical response within a few days. This has certainly improved over the past 12 months, but I'd like to see a similar improvement in the speed with which enhancement requests and bug fixes make it into product updates!

Conclusions and coda

ERwin 4.0 has several successes: The Report Builder is excellent, the Workplace with Model Explorer is even more intuitive, and the conceptual changes are promising. However, these are outweighed by some serious negatives: reverse engineering, synchronization and complete compare which are full of bugs; model derivation and naming standards validation that are difficult to use properly; lack of support for some of SQL Server's newer features; and no support for SQL Server security. Parts of the ERwin 4.0 upgrade feel rushed and only half finished—which is indeed strange considering the time it's taken to deliver. [Readers might also want to read Rose Coder's review at www.infoadvisors.com/articles/ERwin40/erwin_40_test_report.htm.—Ed.]

Although it's a good tool for initial data modeling, I can't recommend the current version of ERwin 4.0 for the serious SQL Server DBA. I really hope CA will fix these shortcomings in my favorite database tool, but until it does, I, for one, shall be looking at alternative products such as Embarcadero's ER/Studio.

[A few weeks after submitting this review, Paul wrote that "I have to say that on the surface ER/Studio's support for SQL 7/2000 is better than ERwin. However, its change management is often even worse than ERwin, especially if you're changing key fields or relationships, as it gets confused unless relationship names are exactly synchronized between the database and the model. After altering the model and then asking it to generate a change script to modify the database, I've had tables that are temporarily renamed then not dropped, some relationships that are transferred to new tables and some that are left on the old ones, and changes to a primary key got it completely confused. I've submitted just as many help desk calls to Embarcadero as I did to CA's, and although the response is friendlier than CA, the outcome is similar (escalated to developers). Perhaps I'm asking too much of these tools, but they do cost thousands of dollars, and for that I expect better than shareware."—Ed.]

Link to www.ca.com/products/alm/erwin.htm

[You can also find a reasonably up-to-date list of CASE tools—including details about their lineage and M&A histories—at www.faqs.org/faqs/software-eng/part4/.—Ed.]

To find out more about Microsoft SQL Server Professional and Pinnacle Publishing, visit their website at http://www.pinpub.com/html/main.isx?sub=57

Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.

This article is reproduced from the July 2001 issue of Microsoft SQL Server Professional. Copyright 2001, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. Microsoft SQL Server Professional is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-493-4867 x4209.