Summary: Discussing Applications of Generative AI to Rule Development and Evaluation

January 31, 2024

Authored by:

Mark Febrizio

On November 16, 2023, the George Washington University Regulatory Studies Center and the IBM Center for the Business of Government co-hosted an event, Building on Regulatory Foundations and Bridging to the Future, commemorating the 30th anniversary of Executive Order 12866 and the 20th anniversary of Circular A-4. Taking place a couple of weeks before ChatGPT’s first birthday, the event featured several breakout sessions, including one focused on generative artificial intelligence (AI) and its applications to rule development and evaluation.[1] While the discussion covered a variety of topics, several recurring themes emerged relating to the potential benefits and risks associated with the use of AI in the rulemaking process.

Dr. David Bray, Distinguished Fellow and Loomis Council Co-Chair at the Stimson Center, and Andy Fois, Chair of the Administrative Conference of the United States (ACUS), facilitated the generative AI breakout session, which was attended by participants from government agencies, academia, private sector companies, and non-profits. In general, the discussion primarily focused on AI’s use during the public commenting stage of rulemaking, rather than another phase like proposal development or retrospective review. Specifically, much of the conversation centered on how agencies should respond to receiving comments that are entirely or partly generated by AI and how they might process comments using AI tools.

During a wide-ranging discussion, several prominent themes emerged. First, agencies use different approaches when addressing AI-generated comments, with consequential results. The Federal Communications Commission’s (FCC) Electronic Comment Filing System and the General Services Administration’s (GSA) eRulemaking program (which is used by dozens of federal agencies as a shared service) represent the two broad approaches to public commenting processes. Participants discussed how, in the mid-2010s, different government agencies experienced spikes in what appeared to be a mixture of human- and bot-submitted public comments, although an exceedingly small number of rulemaking efforts saw more than 10,000 comments. A 2021 analysis of a spike in public comments impacting the Environmental Protection Agency observed that “the 2002 E-Government Act did not anticipate the emergence of bots and thus fails to provide agencies with sufficient guidance on how to identify and treat bots and fake comments.”

Participants observed that the FCC had legally interpreted the Administrative Procedure Act (APA) of 1946 and related policies in a manner that effectively gave senior management less discretion to address the risk of comment surges – and precluded the FCC from being able to adopt the GSA’s eRulemaking program. During the mid-2010s, the then-CIO had attempted to make the case for the FCC to adopt the eRulemaking program but did not succeed. This stemmed from the FCC’s interpretations of the APA that prioritized real-time viewing of comments rather than waiting to post submissions until they are processed, acceptance of all comments even if they were perceived as potential spam, allowance of anonymous comments or comments with no identification checks, a reluctance to use CAPTCHA, and a strong push by external parties for the ability to submit comments in bulk. While the FCC’s legacy Electronic Comment Filing System eventually moved to a cloud-based service that included API rate limits for comment submissions, it employed a GSA service that before 2017 did not monitor API key requests for multiple registrations. After 2017, GSA’s public-facing platform, located at regulations.gov, successfully implemented techniques such as CAPTCHA and API rate limits to mitigate the risk of being overwhelmed by automated submissions. The FCC has also made some adjustments since 2017.

Participants also discussed how the 2017 net neutrality rulemaking demonstrated that the FCC’s interpretations of the APA left it technically vulnerable to astroturfing – defined as organized activity that falsely attempts to pass itself off as a grassroots movement. The Commission’s proposal received nearly 23 million comments in 2017, requiring the FCC to scale its cloud-based systems more than 3,000 percent to address the flood of comments. In 2021, the New York Attorney General identified at least 18 million of these comments as not authentic. Since 2017, regulations.gov has not experienced the same issues, although some rulemakings routinely receive large volumes of mass submissions (though none that approach the scale of 23 million comments). Comparing different legal interpretations of the APA and the downstream impact on technical implementations highlighted how policy decisions may prevent the commenting process from being overwhelmed by bots or generative AI submissions.

A second theme was that generative AI presents opportunities for how agencies process comments. Participants mentioned the need for internal controls on processing and adjudicating comments, wondering how an agency should respond if a commenter claims their comment was not adequately considered by an AI system. One point raised was that considering a comment is not akin to counting a vote or directly following its advice. For instance, an agency may consider the claims presented by critical comments and still determine that the proposed rule is worth pursuing regardless – public commenting does not overrule agency decisions. Nevertheless, the conversation routinely returned to the difficulty of defining what it means both to consider comments when AI is used and to prove that the comments have been considered fully. In this instance, the FCC in the mid-2010s had a small success story in making all comments received during the 2014 and 2017 net neutrality discussions publicly available as a downloadable data set for others to analyze. As agencies work through this question, it is essential to ensure that their policies for using AI comply with the APA and to offer guidance and direction on interpretations and implementations.

One participant suggested that agencies using AI systems decouple the procedures for processing comments from those for reviewing their substance to mitigate the issue of whether they have adequately considered public comments. While AI may serve a key role in categorizing, filtering, and summarizing public submissions, human involvement remains necessary in the latter stage of evaluating substantive feedback. On this topic, participants also deliberated about the role of transparency in how agencies adopt and use AI technologies. Like other technologies, relying on AI too much could distort the rulemaking process, so establishing boundaries and usage policies would aid public accountability. Generally, participants agreed that clear guidance from a body like the Office of Management and Budget on this matter would be valuable.
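
To make the idea of decoupling concrete, the following is a minimal sketch of what a "processing" step might look like when kept separate from substantive review. It is an illustration under stated assumptions, not a description of any agency's system: the function names (`normalize`, `group_duplicates`, `review_queue`) and the sample comment IDs are hypothetical, and the technique shown is plain exact-match grouping of normalized text rather than an AI model. The sketch only groups identical form-letter submissions so that a human reviewer sees one representative per group, along with a count of how many submissions share that text.

```python
from __future__ import annotations

import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase the text, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def group_duplicates(comments: dict[str, str]) -> dict[str, list[str]]:
    """Map a fingerprint of each normalized text to the comment IDs sharing it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for comment_id, text in comments.items():
        fingerprint = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        groups[fingerprint].append(comment_id)
    return dict(groups)


def review_queue(comments: dict[str, str]) -> list[tuple[str, int]]:
    """Return (representative ID, group size) pairs for human reviewers."""
    return [(ids[0], len(ids)) for ids in group_duplicates(comments).values()]


# Hypothetical sample data: two copies of a form letter and one unique comment.
sample = {
    "c1": "Please keep the rule!",
    "c2": "please  keep the rule",
    "c3": "The cost estimate omits small firms.",
}
queue = review_queue(sample)
```

In this sketch no comment is discarded: every submission ID stays in its group, and the substance of each distinct text still goes to a person. That mirrors the participant's suggestion that automation handle sorting while humans evaluate the feedback itself.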

Third, participants considered how to counteract potential negative effects of AI on the rulemaking process, while capturing its benefits for regulatory development. To mitigate risks, participants highlighted the value of red-teaming in predicting how bad actors might use AI technologies in the public commenting process. In fact, President Biden’s executive order from October 2023 on the development and use of AI incorporated “AI red-teaming” – defined as “a structured testing effort to find flaws and vulnerabilities in an AI system” – into several directives. Session participants mentioned how having a consistent process for handling public comments across agencies, built upon compatible interpretations of the APA, would make red-teaming activities even more valuable, since the results would apply in more contexts and fixing a vulnerability for one agency would fix it for all.

When discussing AI’s potential benefits to rule development, several participants asked whether such tools could help agencies formulate regulation. At the very least, multiple participants thought that AI could play a substantial role in synthesizing the data and science that inform rules. Finally, individuals discussed how agencies can take proactive steps to better position themselves to reap the benefits of advanced AI capabilities. One example provided was how DOT's experience with using a structured format for its rules aided in retrospective review by facilitating the use of rule text as data. Greater accessibility of machine-readable rule text could play a role in leveraging AI systems to conduct ex ante and ex post analyses of regulations.

Ultimately, generative AI poses numerous considerations for rule development and evaluation, particularly for public commenting on agency rules. Themes discussed in this session highlighted the need for agencies to be attentive and responsive to modern technologies and adapt their practices accordingly.

[1] Participants in the breakout sessions agreed to abide by the Chatham House Rule of freely sharing the ideas and information discussed in the sessions without attributing them to a particular person or their affiliated organization.