Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
This article is an excerpt from Building Microsoft Access Applications, from Microsoft Press (ISBN 0-7356-2039-3, copyright Microsoft Press 2005, all rights reserved). The author of the book is John Viescas.
John L. Viescas is the author of Microsoft Office Access 2003 Inside Out and the popular Running Microsoft Access books from Microsoft Press. He is president of Viescas Consulting, Inc., a respected provider of database application design and editorial consulting services. He has been recognized by Microsoft Product Support Services as a Most Valuable Professional (MVP) every year since 1993 for his contributions to the community of Access users.
No part of this chapter may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic;, electrostatic, mechanical, photocopying, recording, or otherwise—without; the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.
Chapter 5: Verifying Names
In this chapter:
- Detecting Duplicate Names
- Ensuring Names Are in Proper Case
Whenever you need to store person names in a database, you should make an effort to help the user avoid duplicate entries and enter names in the correct combination of uppercase and lowercase letters. The Membership Tracking application is a classic example of an application that stores person names. This chapter discusses two techniques you can implement to make data entry easier and more accurate.
Detecting Duplicate Names
The rules of normalization tell you that you should, whenever possible, choose a simple set of natural values as the primary key of your tables. When you're dealing with person names and addresses, finding a simple combination of fields that will always be unique is usually difficult if not impossible. So, you end up generating an artificial primary key—usually; an integer number—to; uniquely identify each row.
But how do you help the user ensure that a new person about to be added isn't already in the database? You can certainly perform a lookup on first name and last name to discover if someone else is already in the database with the name about to be saved. That certainly is a good start, but what about the case where the user enters a name that is similar? Is Jack McDonald the same person as John MacDonald?
Genealogists have to deal with "near matching" names when searching old record archives. Many old records were hand-written, and the person recording the data was very likely getting the information from someone barely literate who had no clue how to correctly spell the name. So, the person creating the record just guessed. To deal with this problem, the U.S. National Archives and Records Administration (NARA) created a formula, called Soundex, that generates a matching code based on the sounds of letters, not the exact spelling. This formula produces a four-character code, and when the codes for two names match, it's likely that the names are very similar and sound alike.
You can perform exact name matching, or you can use Soundex codes to try to identify potentially duplicate last names and warn the user before saving a new row. To see how this works, open the frmMembers form in the Membership sample database (Membership.mdb). Go to a new record, choose any Title, and type Eriksen in the Last Name field. Type any first name you like and click the Save Record button on the toolbar (or choose Save Record from the File menu). Figure 5-1 shows you how the application responds. Note that the application found a Gail Erickson in the database who is clearly not the same person. However, the code checks last name only because checking both last name and first name is not likely to find potential duplicates—such; as the Jack McDonald and John MacDonald example noted earlier. (The Soundex code for Jack is J200, and the Soundex code for John is J500.)
Figure 5-1. Code that checks for similar names using Soundex warns you when another person in the database has a potentially matching name (Click picture to view larger image)
Microsoft SQL Server has a native Soundex function, but Access does not. However, you can create a Public Soundex function in any standard module and then call it from anywhere in your application, including queries. You can find the code you need in the modUtility module in all the sample databases. The code is as follows:
Function Soundex(strName As String) As String
'-----------------------------------------------------------
' Input: A string
' Outputs: U.S. National archive "Soundex" number
' This number is useful to find similar last names
' Created By: JLV 03/01/2003
' Last Revised: JLV 06/27/2005
'-----------------------------------------------------------
' A Soundex code is the first letter, followed by
' three numbers derived from evaluating the remaining
' letters. Vowels (including Y) and the letters H and W
' are ignored. When consecutive letters return the
' same numeric code, the number appears only once.
' When two letters with the same code are separated only
' by H or W, the second letter is ignored.
' Letters are translated to numbers as follows:
' B, P, F, V = 1
' C, S, G, J, K, Q, X, Z = 2
' D, T = 3
' L = 4
' M, N = 5
' R = 6
' If the final code after examining all letters is less
' than three digits, the code is padded with zeros.
' Working variables:
' String to build the code, string to hold code number
Dim strCode As String, strCodeN As String
' Length of original string, last code returned, looping integer
Dim intLength As Integer, strLastCode As String, intI As Integer
' Save the first letter
strCode = UCase(Left(strName, 1))
' Save its code number to check for duplicates
strLastCode = GetSoundexCode(strCode)
' Calculate length to examine
intLength = Len(strName)
' Create the code starting at the second letter.
For intI = 2 To intLength
strCodeN = GetSoundexCode(Mid(strName, intI, 1))
' If two letters that are the same are next to each other
' only count one of them
If strCodeN > "0" And strLastCode <> strCodeN Then
' Different code number, add to the result
strCode = strCode & strCodeN
End If
' If this is not the special "skip" code (H or W)
If strCodeN <> "0" Then
' Save the last code number
strLastCode = strCodeN
End If
' Loop
Next intI
' Check the length
If Len(strCode) < 4 Then
' Pad zeros
strCode = strCode & String(4 - Len(strCode), "0")
Else
' Make sure not more than 4
strCode = Left(strCode, 4)
End If
' Return the result
Soundex = strCode
End Function
The Soundex function calls the GetSoundexCode function to calculate the numeric value of each letter examined. The code is as follows:
Private Function GetSoundexCode(strCharString) As String
'-----------------------------------------------------------
' Input: One character
' Output: U.S. National archive "Soundex" number
' for the specified letter
' Created By: JLV 03/01/2003
' Last Revised: JLV 06/27/2005
'-----------------------------------------------------------
Select Case strChar
Case "B", "F", "P", "V"
GetSoundexCode = "1"
Case "C", "G", "J", "K", "Q", "S", "X", "Z"
GetSoundexCode = "2"
Case "D", "T"
GetSoundexCode = "3"
Case "L"
GetSoundexCode = "4"
Case "M", "N"
GetSoundexCode = "5"
Case "R"
GetSoundexCode = "6"
Case "H", "W"
' Special "skip" code
GetSoundexCode = "0"
End Select
End Function
To perform the last name similarity check, I added code to the BeforeUpdate event procedure of the frmMembers form. For a new record, the code opens a recordset that looks for any other member records that have the same Soundex code for the last name as the last name in the record about to be saved. When the recordset finds any matches, the procedure displays all the potential duplicates and gives the user a chance to cancel the save. The code is as follows:
Private Sub Form_BeforeUpdate(CancelInteger)
Dim varID As Variant rst As DAO.Recordset, strNames As String
' If on a new row,
If (Me.NewRecord = True) Then
' Check for similar name
If Not IsNothing(Me.LastName) Then
' Open a recordset to look for similar names
Set rst = DBEngine(0)(0).OpenRecordset( _
"SELECT LastName, FirstName FROM " & _
"tblMembers WHERE Soundex([LastName]) = '" & _
Soundex(Me.LastName) & "'")
' If got some similar names, issue warning message
Do Until rst.EOF
strNames = strNames & rst!LastName & ", " & rst!FirstName & vbCrLf
rst.MoveNext
Loop
' Done with the recordset
rst.Close
Set rst = Nothing
' See if we got some similar names
If Len(strNames) > 0 Then
' Yup, issue warning
If vbNo = MsgBox(gstrAppTitle & " found members with similar " & _
"last names already saved in the database: " & _
vbCrLf & vbCrLf & strNames & _
vbCrLf & "Are you sure this member is not a duplicate?", _
vbQuestion + vbYesNo + vbDefaultButton2, gstrAppTitle) Then
' Cancel the save
Cancel = True
End If
End If
End If
End If
' Additional code not related to this example ...
End Sub
Note Throughout the sample code, you'll see me use DBEngine(0)(0) (equivalent to DBEngine.Workspaces(0).Databases(0)—the; first database in the first workspace) to set a pointer to the current database. In the help files, Microsoft recommends using CurrentDb instead because it makes an independent copy of the current database object, and you're less likely to run into conflicts if multiple users have the same database open at the same time. However, it's a bad idea to let multiple users run the same copy of your code because doing so is more likely to result in locking conflicts and corruption of your database. Also, CurrentDb is much slower because it reloads and revalidates all the objects in the database. If you implement the client/server architecture recommended in Chapter 3, each user has his or her own copy of the database where this code executes, so conflicts are not an issue.
As you might recall, setting the Cancel parameter to True tells Access to not save the record. The user can then either clear the duplicate record or make any corrections before saving the record.
Ensuring Names Are in Proper Case
When users type in names, some might use the Shift key to properly capitalize names as they enter them. But if you want to ensure that names are entered in proper case (perhaps to ensure that names appear correctly on name badges or in correspondence), you can add some code to your forms to assist the user. If you start to enter a new member in the frmMembers form in the Membership database (Membership.mdb) and enter the name in all lowercase, the application prompts you with an appropriate correction, as shown in Figure 5-2.
If you're familiar with the StrConv function, you might think that using it with the vbProperCase argument would be useful to verify uppercase and lowercase letters in a name, but you'd be wrong. Open the Immediate window by pressing Ctrl+G, and type the following:
?StrConv("macdonald", vbProperCase)
The function responds with:
Macdonald
Figure 5-2. The frmMembers form prompts you with a suggested correction when you do not enter a name in proper case (Click picture to view larger image)
The StrConv function also doesn't handle embedded apostrophes (O'Brien) or periods (J.R.) either. Several years ago, I decided to write a custom function that does a better job with names like this. You can find my SetUpper function in the modUtility module in any of the sample databases. The code is as follows:
Function SetUpper(ByVal varFixCaseVariant) As Variant
'-----------------------------------------------------------
' Inputs: A string containing a person name
' Outputs: Updates the string to "proper" case
' Created By: JLV 07/31/1998
' Last Revised: JLV 08/10/2005
'-----------------------------------------------------------
Dim i As Integer
Dim intSkip As Integer
Dim intASC As Integer
Dim strUpper As String
Dim strLast As String
' Make sure the argument is a string
If VarType(varFixCase) <> vbString Then
SetUpper = Null
Exit Function
End If
strUpper = varFixCase
strLast = " " ' set starting "last" character to blank
' First, set everything to lowercase
strUpper = LCase$(strUpper)
' Loop through each character, 1 at a time
For i = 1 To Len(strUpper)
' First, see if we're skipping some letters (special cases)
If intSkip > 0 Then
intSkip = intSkip - 1 ' Yes. Decrement skip count
Else
' If "last" character was a blank (or start of string),
' do some special tests for "O'", "Mc", and "Mac"
If strLast = " " Then
If Len(strUpper) - i > 2 Then ' If at least 3 characters left
' Then if the next two are O' (as in O'Brien)
' or Mc (as in McDonald)
If Mid$(strUpper, i, 2) = "o'" Or _
Mid$(strUpper, i, 2) = "mc" Then
' Make the "O" or "M" uppercase
Mid$(strUpper, i, 1) = UCase$(Mid$(strUpper, i, 1))
intSkip = 1 ' and set up to skip the next character
End If
' Or if the next three are Mac (as in MacDougal)
If Mid$(strUpper, i, 3) = "mac" Then
' Make the "M" upper case
Mid$(strUpper, i, 1) = UCase$(Mid$(strUpper, i, 1))
intSkip = 2 ' and set up to skip the next 2 characters
End If
End If
End If
' If not skipping characters
If intSkip = 0 Then
' Not a special case, so see if "last"
' was a letter or an apostrophe
' Use the ASCII value to avoid having to do binary compares
intASC = Asc(strLast)
' 39 = ', 97 = a, 122 = z, 65 = A, 90 = Z,
' 224 and higher are foreign language letters
' except 247, which is a division sign
If intASC = 39 Or (intASC >= 97 And intASC <= 122) Or _
(intASC >= 65 And intASC <= 90) Or _
(intASC >= 224 And intASC <= 246) Or (intASC >= 248) Then
' If previous character was a letter or
' apostrophe, then leave this one alone
Else
' If previous WASN'T a letter or apostrophe, Upper this one
Mid$(strUpper, i, 1) = UCase$(Mid$(strUpper, i, 1))
End If
' Save this character for next go-around
strLast = Mid$(strUpper, i, 1)
End If
End If
Next i
End Function
As you might suspect, the code that checks the last name uses the SetUpper function to create a comparison value. Because the comparison must be case-sensitive (all comparisons in an Access database are case-insensitive by default), you must use the StrComp function to ask for a true binary compare. Here's the code in the AfterUpdate event of the LastName control on frmMembers:
Private Sub LastName_AfterUpdate() strUpper As String
' Get the corrected uppercase
strUpper = SetUpper(Me.LastName)
' See if it matches what the user entered
If StrComp(strUpper, Me.LastName, vbBinaryCompare) <> 0 Then
' Ask the user if this code should fix it
If vbYes = MsgBox("Would you like to correct the last name to: " & _
strUpper & "?", vbQuestion + vbYesNo, gstrAppTitle) Then
Me.LastName = strUpper
End If
End If Sub
Of course, you could just update the name with the corrected value without asking the user, but the SetUpper function might get it wrong. For example, the SetUpper function will correct Mackey to MacKey. You must weigh that possibility against the potential annoyance of asking the user to verify the correction for each name. The user can avoid the prompt by entering the name in the correct case to begin with.
As you might imagine, you can apply the techniques described in this chapter to any application that must store people's names. In the next chapter, you'll learn how the Membership Tracking sample application tracks member activities.