Website Governance in the AI Era: Why Your Site Belongs Under IT

Two Models, One Lesson
I have worked on both sides of this debate.
The .edu Model: Communications-Led
In the .edu world, the website lived under Communications. Marketing owned the CMS, controlled the editorial calendar, and published without IT review. The sites looked good. They also accumulated technical debt in silence — aging infrastructure, no unified data layer, fragmented integrations — because no one with architectural authority had governance responsibility.
The .gov Model: IT-Led
In the .gov world, IT owned the full stack. Hosting, access controls, security reviews, compliance documentation — all under the CTO. Marketing submitted content through a defined request process. Publishing was slower. The infrastructure was also dramatically more secure, more consistent, and easier to extend when requirements changed.
Both models worked, in context. A communications-led approach served organizations well when the website functioned primarily as a publishing platform. An IT-led model made sense where compliance and system integrity were operational requirements. For most of my career, this was a question of organizational fit.
It is no longer a question of organizational fit.
Your Website Is Now a Data Infrastructure Layer
Your website is now a data infrastructure layer — the primary source that AI systems crawl, parse, and use to represent your organization. ChatGPT's crawler alone generates 3.6 times more requests than Googlebot, Bingbot, and Amazonbot combined. Google AI Overviews reach 1.5 billion users monthly across more than 200 countries. When someone asks an AI assistant about your organization, the answer is assembled from your structured data, your schema markup, and your published content — whether you intended it to be or not.
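The structured data being consumed here is typically schema.org markup embedded in your pages. A minimal illustration, with placeholder names and URLs, of the kind of JSON-LD block AI systems parse to decide how to represent an organization:

```html
<!-- Illustrative schema.org Organization markup; all values are placeholders -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Org",
  "url": "https://www.example.org",
  "sameAs": ["https://www.linkedin.com/company/example-org"]
}
</script>
```

If this block says one thing and your visible page says another, AI systems have no way to know which is authoritative.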
This changes the governance calculus entirely.
Schema Drift and Stealth Crawlers
Misaligned schema markup — structured data that drifts out of sync with visible page content — causes AI systems to misrepresent your organization or stop citing you altogether. Perplexity has been documented operating undeclared stealth crawlers that rotate through generic browser user-agents to bypass robots.txt restrictions, meaning your no-crawl directives are being circumvented by systems you may not even know are indexing you. Your website's API surface area is expanding with every integration, and roughly 70 percent of enterprises have only 30 percent of their APIs properly documented — creating blind spots that no communications team is equipped to manage.
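Schema drift is mechanically checkable. A minimal sketch of what an automated audit might do — extract JSON-LD blocks from a page and flag fields that disagree with the visible content. The function names and the single `name` check are illustrative assumptions; a real audit would compare many more fields:

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect the contents of <script type="application/ld+json"> blocks."""
    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and ("type", "application/ld+json") in attrs:
            self.in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            self.blocks[-1] += data

def check_schema_drift(html, visible_name):
    """Return warnings for schema fields that disagree with the
    organization name actually shown on the page."""
    parser = JSONLDExtractor()
    parser.feed(html)
    warnings = []
    for block in parser.blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            warnings.append("unparseable JSON-LD block")
            continue
        schema_name = data.get("name", "")
        if schema_name and schema_name != visible_name:
            warnings.append(
                f"schema name {schema_name!r} != visible name {visible_name!r}")
    return warnings
```

Run against every published page, a check like this turns "schema integrity" from a vague mandate into a reviewable report — the kind of artifact an IT-led governance process can actually own.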
The Regulatory Environment Is Already Here
Meanwhile, the regulatory environment is converging on exactly this point. NIST SP 800-53 treats websites as components of information systems subject to security and privacy controls. CMMC requires documented data categorization, system inventories, and assigned ownership for every system handling sensitive data. The EU AI Act's high-risk system requirements take effect in August 2026. California's AB 2013 already requires generative AI developers to publish training data summaries. These frameworks do not distinguish between your internal systems and your public website — and neither do the AI systems consuming your content.
This Is an Infrastructure Function
The governance question is no longer about who writes the copy. It is about who owns the data layer, who manages the crawl policies, who maintains schema integrity, who documents the API surface, and who is accountable when an AI system misrepresents your organization to a regulatory body, a prospective client, or a court.
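A crawl policy, at its simplest, is a robots.txt file declaring which agents may index what. A sketch, using publicly documented AI crawler names (agent names change, and — as noted above — undeclared crawlers may ignore the file entirely, which is precisely why someone with infrastructure authority needs to own and monitor it):

```text
# Illustrative robots.txt crawl policy; paths are placeholders
User-agent: GPTBot
Disallow: /internal/

User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
```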
That is not a communications function. It is an infrastructure function. Your website should be governed by your CISO or CTO, with marketing as a close collaborator on content and messaging — not the other way around.
Start for Free
Not sure where your website governance stands? Create a free Commonwealth Creative account to access our governance audit checklist, AI-readiness resources, and expert guidance — no commitment required.
