Morris Ullman, Chief of the Statistical Reports Section at the Bureau of the Census, headed the project that produced Historical Statistics of the United States 1789-1945. The following is a talk he presented to the Philadelphia Chapter of the American Statistical Association while Historical Statistics was still in press. The present manuscript was transcribed in September 1999 by Carmel Ullman Chiswick, from the original typewritten draft (dated 2/17/49) with Ullman’s handwritten notes for revision. The original chart was colored pencil on an 18” x 22” piece of graph paper.
The volume “Statistical Abstract of the United States” has long occupied a revered position in the list of statistical and reference publications. Each year since 1878 a new edition has appeared to serve as a convenient summary to the technician and as a principal tool to the reference librarian. It is in the same category as the dictionary, regularly purchased and, by many, steadily ignored.
The first edition of the Statistical Abstract was issued by the Bureau of Statistics of the Treasury Department. When that Bureau moved to the newly formed Department of Commerce and Labor in 1903, the Abstract moved with it. It was issued by the Bureau of Foreign and Domestic Commerce from 1912 until 1938, after which the compilation became the responsibility of the Bureau of the Census. In April 1943, the preparation of the annual volume was moved into the Office of the Statistical Assistant to the Director of that Bureau. It is at this point that my story for this evening begins.
At the time that our office took over the preparation of this volume, the 1942 edition was ready to go to press. No matter what we tried to find out about the Abstract, its use or its users, we were faced by a blank wall of acceptance. Everyone who used the Abstract accepted it. It was an authority beyond criticism, yet we found much to criticize in it. Everyone used it, but they evaded our questions as to how they used it and no one could tell us what information they found lacking. It was on every list of standard reference volumes, yet only 5,000 copies were sold each year. In the midst of plenty of praise, we found a poverty of information. So, being statistically trained, we decided that the time had come to make a survey. By last-minute arrangement, we inserted a postcard in the 1942 edition requesting users to give us a few simple facts which might help us measure the manner of use of the book.
This survey was primarily exploratory and did not meet scientific standards. We had no record of who bought the Abstracts and only a partial record of those who received free copies. Part of the edition is a House document available to Members of Congress. More copies go automatically to depository libraries and to foreign countries under foreign exchange agreements. The Bureau gives away a considerable number of copies to additional libraries and to other Government agencies including a supply to the State Department for distribution to U.S. offices abroad. All in all, a maximum of 9,000 replies to the postcard were possible, but we had no way of checking on the returns. There were no possibilities of control and none for follow-up. We could not hope to get a good sample of our universe, but we did hope to get some ideas.
Cooperation was better than expected. In spite of lack of control and the inclusion of large segments where replies were unlikely, over 2,000 cards, or over 20 percent of the maximum possible, were returned. In addition to the few facts we requested, we also received a number of free remarks and several dozen letters. These, added to the results of personal inquiries and our own experiences and observations, started the formulation of a program which is now approaching a point of crystallization.
May I inject at this point the observation that I am purposely omitting the day-to-day work involved in the preparation of the volume and the excellent work of the staff in carrying out the tremendous job of collecting and examining reports, evaluating the material in these reports, selecting and annotating the data to be shown, designing the large variety of tables, and nursing the book through the publication processes, including the proofreading. I am not going to dwell tonight on the numerous problems of catching up with accumulated errors, of making selections of material from an up-to-date viewpoint, or on the research necessary to determine whether the material shown is useful and usable. We have been very fortunate in the quality of our staff, and each edition attests to their fine work. However, tonight we will join the multitude and take this work for granted, while we indulge in the luxury of discussing the broader aspects of this publication.
Our little post-card survey confirmed many of our impressions. Most of the copies of the Abstract go to libraries, both public and special. Government agencies also make up a significant proportion of users. Fewer than 10 percent of the cards returned failed to give an organizational affiliation of a sort that would indicate widespread use of each copy.
More important, however, was the impression created that the annual volume itself was not sufficient. Predominant among the criticisms, mostly implied rather than direct, was the desire for more detail along two main lines – one geographic, the other historical. Requests included suggestions for an assemblage of summary material for local areas, and a desire for data for years not shown in the Abstract tables, which, in many cases, show only each fifth or tenth year or group the data for five or ten years for earlier periods.
We were also impressed with the wide use of the Abstract to find out where to go for various types of data. This fact also called our attention to our central position in acting as a clearing house for sources of statistics. Not only did we have a wealth of material in our files, but we were also fortunate in our periodic contacts with almost every producer of statistics. We were part of a large statistical office in the largest center of statistics production in the country and could obtain expert advice almost at will. Added to this was the intangible asset of good will, for the position of the Bureau of the Census in the statistical world, and the record of the Statistical Abstract, ensured a serious reception to almost any request we cared to make.
These facts, as they emerged, made us conscious that we were responsible for more than just a book. We conceived of our obligation as the function of bringing together for the user with limited resources the best summaries of available data, properly annotated to be as useful as possible.
As this responsibility was borne in upon us, we decided to send up a trial balloon. After marshalling our resources, we examined data available from all sources on cities. We selected 79 items which, in the opinion of a considerable number of people, seemed most useful and prepared the 47-page Cities Supplement, which showed these data for all cities with 25,000 or more inhabitants in 1940. This pamphlet was published in September 1944 and has since gone through three printings. We were encouraged to undertake a similar project for counties and, in September 1947, we issued the County Data Book. I believe it can be said in all fairness that these publications filled a real need, as evidenced by the substantial reception which they received. In general, they satisfied a part of the demand for summary material for small geographic areas. Our present plans call for continuing these two publications as one report, a Small Area Data Book, which will have data for both cities and counties in the same volume.
While we were still working on the County Data Book, we received in the Bureau a memorandum from J. Frederic Dewhurst suggesting a Source Book of Economic Statistics. It was proposed that this include in a single volume a selected group of the most important of the comprehensive statistical series available on an annual or a decennial basis over a considerable number of years. This mimeographed memorandum had been widely distributed among interested persons. The proposal had reached the point where a joint committee of the American Economic and American Statistical Associations had been created to consider the feasibility of such a volume. The Economic History Association had also been invited to send representatives to this joint committee since they were very interested in this project.
The proposal struck a responsive chord in the Bureau of the Census since it was very much in line with the second type of request which we had received for the extension of the Statistical Abstract, namely, the filling in of data for years which were not shown in the Abstract volumes. It also provided a solution to another situation. Very often, in trying to find historical data, original source volumes are not available. An increasing number of persons are turning to collections of older editions of the Abstract and transcribing needed data year by year. For many series this procedure is all right, but in general it is fraught with danger. The Abstract is, after all, merely a condensation and usually does not contain the warnings of the source. The principal danger, however, is that there is a good chance of missing the later revisions of data and of definition. Dr. Dewhurst’s proposal, therefore, seemed to provide for a volume which not only would answer requests for data for each year, but would be a substitute for, and an improvement over, the practice of using files of older Abstracts for developing historical series.
Informal discussions were carried on with the Joint Committee as to the possibility of joining forces. The proposal for the Source Book was brought up before the Committee on Problems and Policy of the Social Science Research Council in July 1945. A resolution was passed recommending that the Secretary of Commerce give consideration to the compilation of such a volume by the Bureau of the Census. When received, this resolution was accepted by the Secretary of Commerce and the Bureau of the Census actively began planning this book. At the final meeting of the Joint Committee in May 1946, a Census proposal was presented and accepted.
Since estimates had been included in the Census budget for needed funds, the project, it might be said, was officially under way. Two developments furthered the project. First of all, the Committee, having decided that with the participation of the Bureau of the Census such a volume was feasible, dissolved itself and a new Committee, broader in scope, was set up by the Social Science Research Council to advise the Bureau of the Census on the preparation of the volume. Secondly, the Committee on Economic History of the Social Science Research Council made available funds which made possible the employment of a full-time executive secretary for this Committee. The Secretary worked with the Bureau of the Census and specialized on the procurement of data from consultants.
Basic discussions at this early stage involved the character of the material to be included in the volume. Proposals varied all the way from having a thorough analytical treatment of a relatively small number of series, say two or three hundred, to a very hasty compilation of a very large number of series with practically no text. Pros and cons were extensively debated and a number of ideas emerged. The thorough analytical treatment obviously was not conceivable without intensive research. The advisability of such treatment on a general level was also questioned. The working plan finally agreed upon was to prepare a fairly large number of series with good, specific annotations as to content and source.
It was intended to enlist extensive help and review in the selection and annotation of the series through a large body of consultants on all subjects. The Bureau of the Census was to prepare lists of available series. These lists of series would then be sent to consultants who would review them and comment. Comments were to be coordinated by the Bureau of the Census. The series were to be assembled either by the consultants or the Bureau, whichever seemed to be the most practical in each instance. Lists of series mentioned above were prepared for about a dozen subjects and copies were ready to be sent to consultants, but they never left the Bureau. In trying to detail our request, it became obvious that the person receiving such a list of series would also need to look at the data in order to arrive at a considered judgment. To make data available would mean that a volume would have to be compiled before consultants could be effectively utilized. The modification of plans brought about by the recognition of this situation was largely that of limiting the consultants on the first volume to those who already were most familiar with the data and most conveniently located geographically – namely, to the agencies and technicians in Washington which were largely responsible for the production of the statistics to be used.
In the meantime we had prepared a tentative table of contents based largely on material which we knew or suspected to be available. We began to approach a large number of agencies with requests (a) that they prepare a specific chapter or section, (b) that they prepare a presentation of series in their subject field to be incorporated as part of a larger presentation, or (c) that they make suggestions which would enable the Bureau to prepare these series without undue burden. We had decided not to compile data from older Abstracts unless we were forced to. We wanted to go to the sources of data so that we should get the benefit of the latest information available and minimize the accumulation of error. Almost everyone we approached agreed to the utility of the volume. Almost everyone agreed to the need for reviewing the historical material and eliminating some of the defects which were known to exist. However, most Government agencies operate on a year-to-year basis. They are all intensely concerned with the current problems and it is something of a luxury to work on historical material. Cooperation was excellent, however, and gradually the material was assembled.
Since the funds which the Bureau of the Census had for printing this volume were to expire in June 1947, it was necessary that all material be at the printer by that time. The selection, assembling, posting, correction, preparing of tables for the printer, and the writing of text were concentrated between October 1946 and June 1947. The material was given as careful a review as time permitted. Much had to be eliminated or re-arranged. The telephone was actively used for clarifying the text and making many last-minute changes. A number of selections had to be omitted either because the material did not arrive in time or because it arrived in a form which required more research than we could give. Material on education and communications was received but we did not have the chance to evaluate it properly. It therefore had to be omitted. The Department of Agriculture, which did an excellent job on the chapter on Agriculture, could not get the basic series on Sugar and Tobacco to us in time. These two series therefore do not appear in the volume. A number of other topics which we should have liked to include do not appear because data were not readily available and sufficient time for research could not be devoted to finding and evaluating appropriate material. Our attitude in June 1947 was that we had a good preliminary volume which would serve as a manuscript when the work of revision would be undertaken about 5 years hence.
In a few months galley proof started to flow back from the printer. We proceeded to check the proof in order to clarify any necessary points as part of the procedure permitting the various agencies to give their material a final review. At that time, it was not our intention to make extensive revisions. We were rather anxious to have a volume appear. Our checking, however, indicated that while we had an excellent collection of material, many points had not been fully considered. Citations were not as careful as they should have been, and definitions were not sufficiently complete.
Little by little, we were drawn into the problem of making extensive revisions. We would mark up a set of galley proofs and send them to agencies which had had six months or more in which to let our earlier request mature. We discovered that during the 6 to 12 months between the time that we had received the material from the agencies and the time that we sent them the galley for review, many agencies had gone back and actually revised much of their historical material. For example, three agencies had completely revised their historical material and had published the new figures. We decided that it would not be advisable to include the unrevised figures in the historical volume while publications of the agency showed more recent data. In other cases a more careful review of the material in type uncovered defects not apparent in the original submittal. In checking back data presented in historical series which were first furnished to us, agencies discovered that some of those series had been improperly compiled. In some cases they actually showed incorrect trends. One example is illustrated on this chart which indicates the difference between the figures as they appeared in galley proof and as they were corrected by the agency for final presentation.
From October 1947 until the fall of 1948, therefore, the Bureau of the Census found itself with a major problem of revision on its hands. One person spent 9 months just checking every source cited in the volume to make sure that the data presented were obtained from the source given and that the exact title of the source was used. A great many differences were uncovered which would have been quite misleading if they had been permitted to appear.
Let me review briefly what we tried to do in the way of text. Since this was a general purpose volume, we obviously could not limit discussion with specific problems in mind. We had to present data so that they would be generally usable on a large variety of problems. We therefore set out to present the best series available for which the methods of preparation were known. We wanted to include in the text the exact source where these figures could be obtained, the various specifications for the series such as area covered, time period reference, definition of terms, and method of collection if necessary. This information was to be supplemented by anything which was considered necessary for intelligent use of the data, as, for example, adjustments made for purposes of series linkage. These objectives could not be met for each one of the almost 3,000 series which finally appeared in the volume. As much as possible was done in each case. As a matter of interest I would like to mention a few of the problems which we encountered.
In some cases, we cannot be certain whether data shown are for Continental United States as a whole, or whether data for one or two territories are included. An example of this was the group of series on forest areas. The data furnished were compared with series from other sources and substantial differences discovered. We did what checking was possible and queried the agency. It developed that one series was revised and excluded Alaska; the other was unrevised and included Alaska. Neither source referred to the inclusion or exclusion of territories. We were not able to determine at which year Alaska figures were first included in the series.
Another type of problem which arose involved the shifting of definitions during the production of a series. In several of the mineral fields, for example, early data are simply labeled “production” without explanation. For an intermediate period, the same figures are sometimes labeled “production” and at other times, “shipments”. Careful examination demonstrates that the older “production” series is continued as “shipments” and data are probably comparable. The modern “production” series is not comparable with the earlier series so labeled. In some instances, we could not determine specifically over what years one definition was used and over what period the other. During the years of compilation, the agency, like many others, saw no need to define its terms sharply – an attitude which still prevails in many quarters, as can be demonstrated by examination of government reports. We cannot help but feel that accurate labeling, even if only in a footnote, would help the user.
Producers of statistical series, especially in fields where current data are issued, include some who do not feel any qualms of conscience about these terminology changes, and in other instances the revisions have apparently been buried in the shifts of bygone years. In some instances, it was not possible to find out when the shift was made nor was it possible to determine whether there was an overlap period which would indicate the relationship between the two definitions. A common practice, when a new term or concept is adopted, is merely to substitute the new term in the column head of historical tables after adding data for the most recent year. Thereby all data for earlier years are suddenly given a new name. Thus, the series on customs receipts included figures which had been labeled “calculated duties” and “collected duties” at different times. At least in modern times, these are actually two different concepts which, although closely parallel, are measured by different numbers. At some time in the past, numbers which had appeared as “collected duties” suddenly became “calculated duties.” Checking earlier reports merely added to the confusion, for apparently the terms were used interchangeably. An exact determination of the correct label was never made in this instance. In situations such as these, we carried our research as far as we could and then explained the situation in the notes as fully as possible. In these instances, we hope the users may shed light on the situation in time for the revision.
A common difficulty in historical compilations and sources is the inadequacy of text and footnotes. The case is illustrated by the 1936 edition of Merchant Marine Statistics. It has no text and is replete with clerical errors very easily uncovered by the simple process of addition of detail to totals. The necessity of referring back to a number of annual reports to check some of the errors uncovered also brought to our attention very important footnotes which appeared in earlier issues but had disappeared in later issues. In some cases, these footnotes had been re-phrased to the point where they were misleading.
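The check described above, adding the detail in each row and comparing the sum with the published total, can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the row layout, and all of the figures are hypothetical and are not taken from Merchant Marine Statistics or any other source.

```python
def find_total_mismatches(rows, tolerance=0):
    """Return (label, published_total, computed_total) for each row
    whose detail items do not add to the published total."""
    mismatches = []
    for label, details, published_total in rows:
        computed = sum(details)
        if abs(computed - published_total) > tolerance:
            mismatches.append((label, published_total, computed))
    return mismatches


# Hypothetical figures, invented purely for illustration.
# Each row: (year label, list of detail items, total as printed).
sample_rows = [
    ("1866", [100, 250, 50], 400),
    ("1867", [120, 260, 55], 435),
    ("1868", [130, 270, 60], 406),  # printed total has transposed digits
]

mismatches = find_total_mismatches(sample_rows)
for label, published, computed in mismatches:
    print(f"{label}: printed total {published}, detail adds to {computed}")
```

A mismatch found this way only shows that something is wrong in the row; whether the error lies in the total or in one of the detail items still has to be settled by going back to the original reports.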
In a few cases, the footnotes had been accidentally shifted so that they were placed against wrong years and the error was continued in later editions of the annual report. Other footnotes were not complete. For many years in historical tables on American Merchant Marine tonnage, a note has been carried against the years 1818, 1829, and 1830, pointing out that the tonnage decrease arises from a correction of registered tonnage in those years by the striking off of registered vessels which were determined by the collectors to have been lost at sea, etc. A routine check of the source cited for this (The American State Papers, Commerce and Navigation) revealed that this type of arbitrary reduction began in 1800. From that time until at least 1870 significantly large amounts of tonnage were dropped at frequent intervals. These clearances of tonnage accounts have had an important effect upon the changes from year to year. The footnote, although not too clear for the dates indicated, necessarily gives the user the impression that only the specified years were affected. Actually, many other years were affected far more seriously, and even the validity of data for years prior to 1818, at least, is thrown into doubt.
It might be said in general that footnotes in these historical tables are usually prepared with the specific purpose of the compiler in mind. Complete annotation of all situations which might arise is an impossible requirement, but it would appear that a more general approach would be very helpful.
As a last example of some of the problems which arose, let me mention that the use of historical material may uncover faults of preparation which can be corrected. In the instance of gold and silver production, we were furnished with an historical series which did not agree with the series shown in the standard historical presentation of the responsible agency. A check revealed that the series as given us had been prepared independently and that the standard series, published for many years, should be superseded. In the instance of merchant vessel data, it was discovered that the total tonnage for 1868, published annually since 1880, accidentally omitted a sizeable segment of the fleet.
These problems are all important and concerned us intensively while we were engaged on the preparation of the volume. Actually, they affect a relatively small proportion of the total number of series. They indicate very clearly the need for constantly being on guard when using historical data.
To these problems can be added the errors which arise through the mechanical transfer of data from one source to another. There are possibilities of incorrect transcription, of typographical errors, and of incomplete definition. The user must always exercise caution in using data to make sure that the data at hand are adequate for the purpose. Where important decisions are involved, there is no substitute for study of the original source.
In spite of all this, however, the volume which will be available in two months, titled “Historical Statistics of the United States,” and which is the third supplement in the series of Abstract publications, will be a very useful document. For $2.50 you will have a convenient, attractive 350-page book which contains approximately 3,000 series going back through time. The subjects covered are almost as comprehensive as those of the annual Abstract volume. There is approximately one page of text for each two pages of tables, giving information which we feel you will need when you use the data.
The first chapter presents historical data on Wealth and Income. It was prepared by Harlow D. Osborne of the National Income Division of the Office of Business Economics. This chapter was first conceived as a summary of the historical development of the American economy. The concept of such a summary exceeded the physical limits of our resources and the field was gradually narrowed. The idea of a summary of the American economy in terms of historical data is an idea which I hope is not lost. If not developed earlier, this will probably be one of the principal problems to be taken up when the revision is undertaken.
A special Appendix covers monthly and quarterly indicators of business conditions. This Appendix was prepared by Geoffrey H. Moore of the National Bureau of Economic Research. One of the early problems we had to face was an intense interest in historical data needed for the study of business cycles. Annual data may be adequate for general trends, but they definitely are not adequate for more detailed studies. Our decision to concentrate on annual data did not mean that we minimized the importance of the more frequent time period. To supplement our presentation, we arranged for the National Bureau to select approximately 30 of the more useful monthly and quarterly series which we are presenting here along with a small table showing the turning points of business cycles as determined by the National Bureau.
Between the first chapter and the Appendix are chapters or sections on Population, Vital Statistics, Labor, Agriculture, Minerals, Power, Construction and Housing, Manufactures, Transportation, Price Indexes, Foreign Trade, Banking, Governments, etc. Some chapters were prepared by a single agency; others contain material from several contributors; others were prepared in whole or in part by the Census staff. Certain subjects have been omitted but, as you can see, our coverage is still pretty comprehensive.
The work of selection, compilation, and annotation has been done with considerable care although I will not pretend that the book itself will be without error. You will find, however, that the standards which have traditionally governed the preparation of the annual Abstract have been rigorously applied and, I might even add, in the case of this volume, much more intensively. In some instances the annotations in the historical volume are probably more adequate than the source reports. In most instances, however, only the source volume can give you the detail sufficient for a highly intensive analysis of your data. It would probably still be preferable to have this one book rather than a library of volumes. It is also important, therefore, that if you do uncover weaknesses of presentation or errors, you let us know about them. In that way we are able to take into account the faults and incorporate whatever changes are necessary in the next revision. Sometimes the changes called for may also apply to the data in the annual volume.
In conclusion, I would like to comment briefly on the future. In preparing the Historical Volume, we have not used the data found in the annual Statistical Abstract of the United States except in a few instances. In all other cases, we have gone back to the source. The preparation of material for the Abstract had become fairly routine after 70 years and mechanical errors which had been introduced 20 and 30 years ago had been continued. The result of going back to the source has been to uncover many things which must be changed in the annual volume. We intend to integrate the presentation in the annual volume with the historical supplement. In the future the annual volume will emphasize more current information whereas the technician who requires historical material can find that type of material brought together for convenient use in one book. This job of revision is planned for the next two or three years so that the new information and the new ideas which have evolved can be introduced into the annual series. In the meantime files of information are being built up which will be very useful in revising both the annual volume and the historical supplement. About 1952 the preparation of the second edition of historical statistics will be under way. The job which will face us then will in some ways be more extensive than the job which we have done now. It will have the advantage of having a more advanced starting point. We will have a formal presentation which can be evaluated for usefulness and clarity. Gaps will be apparent and weaknesses will have been studied, and we hope that the information needed to correct some of these faults will be available.
By that time we hope to have issued a new supplement for geographic areas. Whether these volumes will be issued or whether there will be additional supplements to the Statistical Abstract will depend, of course, as does every government project, on the interest and needs of the public. If the sale is high or if there is other evidence that this is a proper function for a government agency, the program will continue. If interest is lacking, there will be no support for its continuation.
As a last point, let me raise a very general problem. There has been considerable discussion of standards for the production of primary statistical reports. The Division of Statistical Standards has prepared a statement which has been agreed to by the heads of the major statistical agencies indicating the type of information which should be included in reports of surveys. The results of the recent pre-election predictions have also brought forward the problem of furnishing complete information on concepts and problems along with the statistics which result from the survey. I would like to point out that there is also a problem of standards for compilers of secondary sources. This group also includes persons who reproduce data in technical articles, sometimes carelessly and without adequate description. Perhaps this is a subject which the Committee on Standards of the American Statistical Association may find appropriate to include on its agenda for future consideration. Our experience in preparing the historical supplement would indicate that this is a problem worthy of consideration.