Deliverable 1: Abstraction

Objectives: Gain experience using abstraction to design classes by assessing possible attributes and behaviors and capturing those that are relevant to the project at hand. 

Description: This program is intended to provide you with experience designing classes (abstraction).

Recall that abstraction requires the designer to ignore irrelevant features, properties, or functions and emphasize only those that are relevant to the given project.

As noted in the project introduction, there are several different ways to format references in a research paper.  Probably the most common ones are the APA style, the MLA style, and the Chicago style.  You can find examples of each format in the Reference Styles help sheet.

This program is the first component that you will develop in the process of designing and implementing a reference formatting system.  Notice that every reference, regardless of the style, includes the author's name (or authors' names) and a publication date.  As an exercise in abstraction you will consider each of those independently.

Part I: clsName

First, consider a name.  What are the possible attributes that can be associated with a name?  That is pretty easy.  You have a first name and a last name.  And a middle name. Or is that a middle initial?  What about those odd ducks with two middle initials, like J.R.R. Tolkien?  There is often a suffix as well, such as Ph.D., MD, Sr., Jr., III, etc. And many names have a prefix like Mr., Mrs., Ms., Dr., Don, Doña, Sr., Sra., although it could be argued that they are not really part of the name.  You can read more on suffixes here:

You must also consider two-word last names (e.g., St. James) and hyphenated last names (e.g., Smith-Cross).

And those are just the types of names that students in English-speaking countries are most familiar with.  When designing software one must take into account international factors that can impact design.  One site explores this consideration in relation to names.

The authors of the KISS site point out that "the rest of the world" doesn't necessarily conform to the "first name, middle initial, and last name" model.  The reason is that there are two, not three, major components to people's names:

1. Given-names: these are the names given to children by their parents (or, rarely, are changed by the children).

2. Family-names (otherwise known as surnames): these are the names passed down from generation to generation (except in Iceland).

Example 1: Mary Elizabeth Smith has two given-names and one family-name. If she calls herself Mary, then she has a first name, can use a middle initial, and has no problem with the standard format.

Example 2: Supposing, however, that she has been called Elizabeth (Liz for short) since birth. Then, her name won't fit the standard forms. Neither will that of J. Edgar Hoover (a former FBI director) and many others. You can believe that Liz would not want to have to answer to the name Mary just because someone designed a form that records only her first name and middle initial.

Example 3: Liz Smith marries someone called Jim Brown. She may call herself Elizabeth Smith, or change her name to Elizabeth Brown, or Elizabeth Smith Brown. Her name Mary still is first, but she hardly ever uses it. So, what is now her "middle initial"?

Example 4: Ada María Guerrero Pérez is from Mexico. Her names (nombres) are Ada María (and she always uses both these names), her primer apellido (father's family name) is Guerrero and her segundo apellido (mother's family name) is Pérez. You would find her in a Mexican phone book under "Guerrero Pérez, Ada María." She calls herself Ada María Guerrero. So what should she do when she encounters a US form asking for her "first name, middle initial, and last name"?

Example 5: Ada María marries someone called Alfonso Ernesto Hernández López. He has two given-names (and uses the second of these), and two family-names (Hernández and López). You find him in a Mexican telephone book under "Hernández López, Ernesto." He calls himself Ernesto Hernández. How does he respond to a US form asking for first name, middle initial, and last name?

Example 6: Ada María has new problems with US forms: after her marriage she is Ada María Guerrero de Hernández. So now how does she respond to a US form asking for first name, middle initial, and last name?

Example 7: Li Xiao Ping is from China. In China, Japan, Vietnam, Hungary, and some other countries, the family-name (Li) comes first. The two components (Xiao Ping) of his given name are used together as one name such that they could almost be written Xiaoping. You find him in a Chinese phone book as Li Xiao Ping (written in 3 Chinese characters with no comma). How should he respond to a US form asking for first name, middle initial, and last name? Which of his names is last?
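To make the two-component view concrete, here is a minimal sketch, not a required design. The class name PersonName, its three attributes, and both methods are hypothetical and chosen only for illustration; whether your reference formatter needs any of them is exactly what the abstraction exercise asks you to decide.

```python
from dataclasses import dataclass


@dataclass
class PersonName:
    """Hypothetical two-component name: given-names plus family-names."""
    given_names: list            # e.g. ["Ada", "María"]
    family_names: list           # e.g. ["Guerrero", "Pérez"]
    family_first: bool = False   # True where the family-name is written first

    def full(self):
        given = " ".join(self.given_names)
        family = " ".join(self.family_names)
        return f"{family} {given}" if self.family_first else f"{given} {family}"

    def directory_entry(self):
        # Family-first names are already in directory order (no comma).
        if self.family_first:
            return self.full()
        return f"{' '.join(self.family_names)}, {' '.join(self.given_names)}"


ada = PersonName(["Ada", "María"], ["Guerrero", "Pérez"])
li = PersonName(["Xiao", "Ping"], ["Li"], family_first=True)
print(ada.directory_entry())   # Guerrero Pérez, Ada María
print(li.full())               # Li Xiao Ping
```

Notice that nothing in this sketch uses the words "first", "middle", or "last", which is why Examples 4 and 7 fit without special cases.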

Now designing a name class doesn't seem so simple after all!  However, when engaging in object-oriented design you begin by considering all possible attributes and then narrow those attributes down to those that are relevant to the problem at hand.

The problem at hand is a reference formatter.  Look again at the Reference Styles help sheet to see what information is required to represent an author's name.  If it does not contain enough examples to help you get a feel for internationalizing the design, Google the terms "reference styles", "MLA", "APA", or "Chicago".

Abstraction is not limited to attributes.  Once you complete the abstraction process with regard to attributes, you next need to address functional abstraction, a process in which you determine which functionality is important in much the same way as you determine which data items are important.

Functional abstraction takes into account such things as accessor methods (those used to return the values of instance variables), mutator methods (those used to change the values of instance variables), constructors (those methods used to initialize the instance variables), and any necessary data validation or data conversion methods.

You should consider the necessary constructors. There will always be a last name, but the first name may be a name or an initial.  (You can't always find an author's first name on a publication if they only use an initial. J.R.R. Tolkien is a good example.)  Will there always be a middle name or initial?
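One way to explore these questions is to sketch a constructor and see which arguments you are forced to supply. The sketch below is only an illustration under stated assumptions: the class name Name stands in for clsName, and treating the first and middle names as optional strings is one possible answer, not the required one.

```python
class Name:
    """Hypothetical stand-in for clsName; the attribute choices are assumptions."""

    def __init__(self, last, first="", middle=""):
        # The last name is the only piece of data assumed to always exist.
        if not last.strip():
            raise ValueError("last name is required")
        self.last = last.strip()
        # first and middle may each hold a full name, an initial, or nothing.
        self.first = first.strip()
        self.middle = middle.strip()


tolkien = Name("Tolkien", "J.", "R.R.")
smith = Name("Smith", "Mary")      # no middle name or initial
```

In a language with constructor overloading the same idea would appear as several constructors with different parameter lists.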

Consider mutator methods.  Are they needed for individual attributes, or do you want to provide values for all attributes, or a subset of them, at once?  There are instances in which you want to create "read-only" instance variables, in which case no mutator methods are provided, only accessor methods.
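The difference between a read-only attribute and one with a mutator can be sketched as follows. This is an illustration only: the class name Name stands in for clsName, and the decision to make the last name read-only while the first name stays changeable is an assumption made for the example.

```python
class Name:
    """Hypothetical stand-in for clsName with one read-only attribute."""

    def __init__(self, last, first=""):
        self._last = last
        self._first = first

    @property
    def last(self):
        # Accessor only: no setter is defined, so assigning to .last
        # raises AttributeError.
        return self._last

    @property
    def first(self):
        return self._first

    @first.setter
    def first(self, value):
        # Mutator for a single attribute.
        self._first = value.strip()


author = Name("Jones", "A.")
author.first = "Anthony"       # allowed: first has a mutator
```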

Shifting our focus to accessor methods, the different formatting styles often use variations on how the author's name is displayed, and this affects the components that you must consider.  Even if you limited your scope to just first name, last name, and middle initial you have several combinations.  The fact that middle initial is either omitted or must follow the first name if it is included narrows the possibilities. The fact that first name may be the complete first name or a first initial broadens the possibilities.

Permutations with middle initial:

Anthony T. Jones
Jones, Anthony T.
A.T. Jones
Jones, A.T.

Permutations without middle initial:

Anthony Jones
Jones, Anthony
A. Jones
Jones, A.

So there are eight combinations even if you ignore the suffix, two-word last names, hyphenated last names, etc.  Again, review the Reference Styles help sheet to see what formats are required to represent an author's name.  If it again does not contain enough examples to help you get a feel for internationalizing the design, Google!
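The eight combinations come from three independent yes/no choices (last name first or not, first name abbreviated or not, middle initial shown or not). A minimal sketch that generates all of them is shown below; the function names and the three keyword parameters are hypothetical, and a real design might expose one accessor per required style instead.

```python
def initial(name):
    """Return the first letter of name followed by a period, or ""."""
    return name[0].upper() + "." if name else ""


def format_name(first, last, middle="", *, last_first=False,
                initials=False, use_middle=True):
    given = initial(first) if initials else first
    mid = initial(middle) if (use_middle and middle) else ""
    if mid:
        # "A.T." when the first name is abbreviated, "Anthony T." otherwise.
        given = given + mid if initials else f"{given} {mid}"
    return f"{last}, {given}" if last_first else f"{given} {last}"


print(format_name("Anthony", "Jones", "T"))
print(format_name("Anthony", "Jones", "T", last_first=True, initials=True))
```

The two calls print "Anthony T. Jones" and "Jones, A.T."; the other six forms follow from the remaining flag settings.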

Now consider validation methods.  How do you validate the data stored in a name?  You can test the value character by character to be sure it is not blank and contains no numeric or other non-alphabetic characters.  Is that adequate?  Sure, for most American names.  However, a recent article pointed out that apostrophes, hyphens, or spaces in names must be accounted for as well.  Therefore you should be sure that each character is an uppercase letter (65-90), a lowercase letter (97-122), a space (32) for Dutch names like van Kemp, an apostrophe (39) for Irish, French, Italian and African names like O'Connor, a period (46) for names like St. James, or a hyphen (45) for Arab names like Al-Hussein or hyphenated married names like Joyner-Kersee.
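A character-by-character check along these lines might look like the sketch below. The function name is_valid_name is hypothetical, and the code points are the ASCII values for the characters listed above (46 is the ASCII code for a period).

```python
# ASCII codes for space, apostrophe, hyphen, and period.
ALLOWED_PUNCTUATION = {32, 39, 45, 46}


def is_valid_name(text):
    """Return True if text is non-blank and uses only the allowed characters."""
    if not text.strip():
        return False
    for ch in text:
        code = ord(ch)
        is_upper = 65 <= code <= 90
        is_lower = 97 <= code <= 122
        if not (is_upper or is_lower or code in ALLOWED_PUNCTUATION):
            return False
    return True


for candidate in ["van Kemp", "O'Connor", "St. James", "Al-Hussein", "R2D2"]:
    print(candidate, is_valid_name(candidate))
```

Be aware that this ASCII-only rule, taken literally, rejects accented names such as Pérez from Example 4. Testing letters with str.isalpha() instead of the numeric ranges would be one way to widen it.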

You should also provide utility methods like getInitial that will extract the first letter from a first or middle name.
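As a rough sketch, such a utility could be as small as the function below. The name get_initial is a snake_case rendering of getInitial; skipping leading non-letters and uppercasing the result are assumptions, not requirements.

```python
def get_initial(name):
    """Return the first letter of name, uppercased, or "" if it has none."""
    for ch in name:
        if ch.isalpha():
            return ch.upper()
    return ""


print(get_initial("Anthony"))   # A
print(get_initial(" thomas"))   # T
```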

A simple UML class diagram depicts classes as boxes with three sections: the top one indicates the name of the class, the middle one lists the attributes of the class, and the bottom one lists the methods.  For example, a class diagram of clsFlight (from an airline reservation system) is shown below.

Use a UML class diagram to model your clsName.

Part II: clsDate

Now consider how a reference formatting system stores and displays a date.  Notice that the different formatting styles often use variations in how the publication or conference date is displayed.  All references include at least a year of publication, while some, like proceedings, include a complete beginning date and a complete ending date, like October 31, 2008 - November 2, 2008 or November 7-10, 2008.
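The two range formats just mentioned can be sketched with a small helper. This is an illustration only: the function name format_range and the use of (year, month, day) tuples are assumptions, and the sketch handles only ranges in which both dates are completely known.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def format_range(start, end):
    """Format two (year, month, day) tuples as a conference date range."""
    start_year, start_month, start_day = start
    end_year, end_month, end_day = end
    if (start_year, start_month) == (end_year, end_month):
        # Same month and year: collapse to the short form.
        return f"{MONTHS[start_month - 1]} {start_day}-{end_day}, {start_year}"
    return (f"{MONTHS[start_month - 1]} {start_day}, {start_year} - "
            f"{MONTHS[end_month - 1]} {end_day}, {end_year}")


print(format_range((2008, 10, 31), (2008, 11, 2)))
print(format_range((2008, 11, 7), (2008, 11, 10)))
```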

This program will require you to create a programmer-defined clsDate class that will allow the storage and display of publication dates in different formats.  Most OO programming languages provide a built-in date class.  Can it be used to represent a publication date so that we don't have to develop our own?  In most cases the answer is no.  As noted above, some publications include only a year, some include only a month and year, and some include an entire date. Therefore, your class must allow the storage of a 0 value for day for instances in which only the month and year of publication are known, and should also allow the user to specify both a 0 or null day and a 0 or null month for instances in which only the year of publication is known.   Most built-in date classes require non-zero values for month, day, and year so a user-defined date class is required.

Go through the same process that you followed for name to decide on the attributes and functionality needed for your date class.  Discuss with your professor whether you want to store your date in month, day, and year format, with attributes for each, or in Julian date format with a single attribute for all.  You can read more about Julian dates in Wikipedia or required conversion routines here.

You should provide at least three constructors: (1) one that accepts actual parameters for month, day and year, (2) one that accepts parameters for month and year, and (3) one that accepts an actual parameter for year. 
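In a language without constructor overloading, the three cases can be covered by default arguments, as in the sketch below. The class name PublicationDate stands in for clsDate, and the parameter order (year first, so that it is the only required value) is an assumption made for the example. Validation is deliberately left out here.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


class PublicationDate:
    """Hypothetical stand-in for clsDate; 0 means 'not known'."""

    def __init__(self, year, month=0, day=0):
        if day and not month:
            raise ValueError("a day cannot be stored without a month")
        self.year = year
        self.month = month
        self.day = day

    def __str__(self):
        if self.month == 0:
            return str(self.year)
        if self.day == 0:
            return f"{MONTHS[self.month - 1]} {self.year}"
        return f"{MONTHS[self.month - 1]} {self.day}, {self.year}"


print(PublicationDate(2008, 10, 31))   # October 31, 2008
print(PublicationDate(2008, 10))       # October 2008
print(PublicationDate(2008))           # 2008
```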

Think about the different formats that can be used to display dates. This site contains an excellent discussion.

Whether you use Julian dates or individual attributes for month, day, and year you are going to have to validate your month, day, and year.  That will require validateMonth, validateDay, and validateYear methods.  It will also require a function to determine if the year is a leap year, since that knowledge is required to validate the day.  Recall that most years that are evenly divisible by 4 are leap years.  However, years that are evenly divisible by 100 but not by 400 are not leap years. Thus 1996 is a leap year, 1900 is not, and 2000 is.
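The leap year rule and the validation methods can be sketched as plain functions. The snake_case names mirror validateMonth, validateDay, and validateYear; accepting 0 for an unknown month or day and requiring a year of at least 1 are assumptions made for this illustration.

```python
def is_leap_year(year):
    # Divisible by 4, except century years that are not divisible by 400.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def validate_year(year):
    return year >= 1                  # assumed lower bound


def validate_month(month):
    return 0 <= month <= 12           # 0 means the month is not known


def days_in_month(month, year):
    if month == 2:
        return 29 if is_leap_year(year) else 28
    return 30 if month in (4, 6, 9, 11) else 31


def validate_day(day, month, year):
    if day == 0:
        return True                   # 0 means the day is not known
    if not 1 <= month <= 12:
        return False                  # a known day needs a known month
    return 1 <= day <= days_in_month(month, year)


print(is_leap_year(1996), is_leap_year(1900), is_leap_year(2000))
```

The final line prints True False True, matching the three years named above.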

As with clsName, design a UML class diagram to model your clsDate.

Note: Although abstraction requires that you ignore irrelevant details, you must also take into account future expansion of the project.  How does this affect your design decisions in this case?

Submissions for this deliverable:

  • UML class diagram for clsName

  • UML class diagram for clsDate