Collaborative Intelligence in Accounting: A Human + AI Complementarity Framework for Professional Work

Authors

DOI:

https://doi.org/10.64044/vpm19y22

Keywords:

generative AI; large language models (LLMs); financial reporting; period-end close; human–AI collaboration; AI governance

Abstract

Background: Public discussion of generative artificial intelligence (AI) in accounting often swings between the allure of full automation and anxiety about job displacement. The more immediate reality in organizations is human + AI work: AI accelerates drafting, summarization, and pattern detection, while professionals remain accountable for judgment, materiality, and defensibility in financial reporting and analysis.

Methods: This paper synthesizes recent research and practitioner guidance (2023–2025) to develop a practical model for designing human–AI collaboration, sometimes described as collaborative intelligence, in the financial reporting function (often referred to as controllership), including period-end close, financial statement preparation, variance explanation, management reporting narratives, and accounting policy documentation.

Results: The paper develops the C³ Framework (Complementarity, Controls, and Competencies). The framework maps accounting tasks by task structure and by judgment/materiality to recommend collaboration modes. It specifies five mandatory control points for high-judgment use cases: source grounding and traceability, independent verification and tie-out, contradiction testing, escalation and approval, and audit-trail logging. It also proposes a role taxonomy that clarifies review responsibility, escalation thresholds, and evidence retention.

Conclusions: The C³ Framework provides implementable design patterns and testable propositions intended to help accounting leaders capture productivity gains from human + AI work while preserving accountability, consistency, and alignment with governance expectations in high-stakes reporting contexts.
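
As a rough illustration of how the five control points named under Results might be checked in practice, the following Python sketch tracks which controls have evidence for a given use case. It is not taken from the paper: the class name, field names, control identifiers, and the example task are all hypothetical, and the rule that controls apply only to high-judgment use cases is an assumption based on the wording of the abstract.

```python
from dataclasses import dataclass, field

# The five control points named in the abstract.
# The identifier spellings are illustrative, not the paper's.
CONTROL_POINTS = (
    "source_grounding_and_traceability",
    "independent_verification_and_tie_out",
    "contradiction_testing",
    "escalation_and_approval",
    "audit_trail_logging",
)


@dataclass
class UseCaseReview:
    """Hypothetical record of control evidence for one AI-assisted task."""
    task: str
    high_judgment: bool
    evidenced: set = field(default_factory=set)

    def missing_controls(self):
        """Return control points that still lack evidence, in listed order."""
        if not self.high_judgment:
            # Assumption: the controls are mandatory only for
            # high-judgment use cases, as the abstract states.
            return []
        return [c for c in CONTROL_POINTS if c not in self.evidenced]

    def ready_for_release(self):
        """True when no mandatory control point is outstanding."""
        return not self.missing_controls()


# Example: a narrative draft with only two controls evidenced so far.
review = UseCaseReview(
    task="variance explanation narrative",
    high_judgment=True,
    evidenced={"source_grounding_and_traceability", "audit_trail_logging"},
)
print(review.missing_controls())
print(review.ready_for_release())
```

In this sketch a high-judgment output is held back until every control point has evidence attached, while lower-judgment tasks pass without the checklist. An actual implementation would follow the thresholds and roles defined in the full paper.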

Published

05/09/2026