Journal Title
Title of Journal: Journal of Regulatory Economics
Abbreviation: J Regul Econ
Authors: Tim Loughran, Bill McDonald
Publish Date: 2013/11/12
Volume: 45, Issue: 1, Pages: 94-113
Abstract
In October 1998, the SEC implemented a rule requiring firms to use plain English in their prospectus filings. In addition to the rule, the SEC encouraged the use of plain English in all filings and communication with shareholders. Did the SEC rule significantly impact managers’ disclosure style? And, more interestingly, did the SEC’s recommendations lead managers to change their disclosure style in filings not under the plain English mandate? Our textual analysis of Form 424, IPO prospectus, and 10-K filings over 1994–2009 finds that the SEC’s implementation of the plain English rule substantively impacted managerial behavior. When we focus on 10-K filings, we find that after the 1998 rule, firms are more likely to improve the stylistic components of their filing before an equity issuance, and firms with better corporate governance policies are more likely to comply with the rule.
We thank Robert Battalio, Michael Crew (Editor), Andrew Ellul, Margaret Forster, Paul Gao, Kathleen Hanley, Steven Kachelmeier, Feng Li, Ray Pfeiffer, Jennifer Marietta-Westberg, Simona Mola, Richard Sloan, two anonymous referees, and seminar participants at the 2009 American Finance Association annual meeting, University of Amsterdam, and University of Notre Dame for helpful comments. We are grateful to Betsy Laydon for research assistance.
We use the master.idx file from the SEC web site to identify filings from 1994 to 2009. We then programmatically download each Form 424, IPO S-1, 10-K, or 10-K405 filing for subsequent parsing. Note that until 2003, a box on the front page of the 10-K form was to be checkmarked if a “disclosure of delinquent filers pursuant to Item 405” was not included in the current filing, nor anticipated to be disclosed in statements incorporated by reference or amendments. If this box were checked, the form was filed as a 10-K405. In 2001, almost one-third of 10-K filings were 10-K405 forms. According to the SEC, because there was confusion and inconsistency in making this choice, the 405 provision was eliminated after 2002.
As this choice has no impact on the focus of our study, we included both 10-K and 10-K405 forms in our sample and make no distinction between the two throughout the analysis.
Remove graphics: increasingly through time, the filings have ASCII-encoded graphics embedded in the file. ASCII encoding of a graphic increases the size of a file by orders of magnitude. For example, the median file size for the year 2000 was approximately 270 KB and the largest filing without graphics was 57 MB. Texas Utilities’ year 2000 filing included graphics and was 204 MB.
Remove abbreviations: counting words per sentence is important for the readability measures. This is typically done by removing abbreviations and then counting the number of sentence terminators and the number of words. For traditional text, this is quite effective after eliminating a few common abbreviations. Parsing financial disclosures, however, is much more difficult because they contain a variety of abbreviations and use periods to delineate section identifiers or as spacers. Liberman and Church (1992) find that 47% of the periods occurring in the Wall Street Journal are associated with abbreviations. We created a program that is more exhaustive in identifying abbreviations than the routine used in the PERL Fathom package. Because the PERL Fathom package does not deeply parse for abbreviations, it will tend to report more sentences than are actually contained in a filing, thus biasing the average number of words per sentence downward.
Convert lists to sentences: as in the Fathom package, our sentence count is based on the number of sentence terminators. One challenge in parsing financial disclosures into sentences is that the documents often contain lists separated by semicolons or commas that should not be treated as a single sentence. Redish (2000) notes the problem of measuring readability in texts with extensive lists. Our program attempts to identify such lists based on punctuation and line spacing. Where the program determines that a sequence of text is a list, commas or semicolons delineating the list items are replaced with periods. In addition, to avoid counting the periods in section headers (e.g., Sect. 12), ellipses, or other cases where a period is likely not terminating a sentence, there must be at least 20 characters between two periods for the token to be treated as a sentence.
Creating word and phrase counts: the cleaned document is next divided into tokens based on word boundaries using a regular expression. Each token is compared with a master dictionary file to determine if the token is a word. Only tokens of two or more letters are counted as words; thus, the words “I” and “a” are not counted. The words for each document are then loaded into a dictionary for that specific filing containing the words and their counts. Word counts are derived from this dictionary. Phrases for the Plain English variable are identified by applying regular expressions to the cleaned document.
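The sentence and word counting rules described above can be illustrated with a minimal Python sketch. It is not the authors' program: the function names, the toy master dictionary, the choice of ".", "!" and "?" as sentence terminators, and the decision to measure segment length after trimming whitespace are all assumptions made for illustration. Only the 20-character minimum and the two-letter minimum come from the text.

```python
import re
from collections import Counter

MIN_SENTENCE_CHARS = 20  # from the text: at least 20 characters between terminators
MIN_WORD_LETTERS = 2     # from the text: one-letter tokens such as "I" and "a" are skipped


def count_sentences(text, min_chars=MIN_SENTENCE_CHARS):
    """Count terminated segments that are long enough to be treated as sentences."""
    # Assumption: '.', '!' and '?' are the sentence terminators.
    segments = re.split(r"[.!?]", text)
    # The last piece follows the final terminator, so it is not a terminated sentence.
    terminated = segments[:-1]
    return sum(1 for seg in terminated if len(seg.strip()) >= min_chars)


def count_words(text, master_dictionary, min_letters=MIN_WORD_LETTERS):
    """Return a per-document Counter of tokens that appear in the master dictionary."""
    tokens = re.findall(r"\b[A-Za-z]+\b", text)  # split on word boundaries
    words = (t.lower() for t in tokens if len(t) >= min_letters)
    return Counter(w for w in words if w in master_dictionary)


# Hypothetical toy inputs, for illustration only.
master_dictionary = {
    "the", "company", "filed", "its", "annual", "report", "on", "time",
    "risks", "include", "decline", "in", "revenue", "for", "a", "i",
}
sample = (
    "Sect. 12. The company filed its annual report on time. "
    "Risks include... a decline in revenue for the company."
)

n_sentences = count_sentences(sample)
word_counts = count_words(sample, master_dictionary)
n_words = sum(word_counts.values())
print(n_sentences, n_words, n_words / n_sentences)
```

In this sketch the header fragment "Sect. 12." and the ellipsis contribute no sentences because each piece between terminators is shorter than 20 characters, which is the effect the rule is meant to have.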