Research Dashboard

Automated surveillance of arXiv for my core research tracks.

1. Kinetic AI Risk

Scope: Intersection of Large Language Models (LLM) and ICS/SCADA.

Tango: Taming Visual Signals for Efficient Video Large Language Models

2026-04-10 | Shukang Yin, Sirui Zhao, Hanchao Wang...

Token pruning has emerged as a mainstream approach for developing efficient Video Large Language Models (Video LLMs). This work revisits and advances the two predominant token-pruning paradigms: attention-based selection and similarity-based clustering. Our study reveals two critical limitations in existing...

Large Language Models Generate Harmful Content Using a Distinct, Unified Mechanism

2026-04-10 | Hadas Orgad, Boyi Wei, Kaden Zheng, M...

Large language models (LLMs) undergo alignment training to avoid harmful behaviors, yet the resulting safeguards remain brittle: jailbreaks routinely bypass them, and fine-tuning on narrow domains can induce ``emergent misalignment'' that generalizes broadly. Whether this brittleness reflects a fundamental lack...

Trans-RAG: Query-Centric Vector Transformation for Secure Cross-Organizational Retrieval

2026-04-10 | Yu Liu, Kun Peng, Wenxiao Zhang, Fang...

Retrieval Augmented Generation (RAG) systems deployed across organizational boundaries face fundamental tensions between security, accuracy, and efficiency. Current encryption methods expose plaintext during decryption, while federated architectures prevent resource integration and incur substantial overhead. We introduce Trans-RAG, implementing a novel...

VisionFoundry: Teaching VLMs Visual Perception with Synthetic Images

2026-04-10 | Guanyu Zhou, Yida Yin, Wenhao Chai, S...

Vision-language models (VLMs) still struggle with visual perception tasks such as spatial understanding and viewpoint recognition. One plausible contributing factor is that natural image datasets provide limited supervision for low-level visual skills. This motivates a practical question: can targeted synthetic...

VL-Calibration: Decoupled Confidence Calibration for Large Vision-Language Models Reasoning

2026-04-10 | Wenyi Xiao, Xinchi Xu, Leilei Gan

Large Vision Language Models (LVLMs) achieve strong multimodal reasoning but frequently exhibit hallucinations and incorrect responses with high certainty, which hinders their usage in high-stakes domains. Existing verbalized confidence calibration methods, largely developed for text-only LLMs, typically optimize a single...

When LLMs Lag Behind: Knowledge Conflicts from Evolving APIs in Code Generation

2026-04-10 | Ahmed Nusayer Ashik, Shaowei Wang, Ts...

The rapid evolution of software libraries creates a significant challenge for Large Language Models (LLMs), whose static parametric knowledge often becomes stale post-training. While retrieval-augmented generation (RAG) is commonly used to provide up-to-date API specifications, "context-memory conflict" arises when external...

RIRF: Reasoning Image Restoration Framework

2026-04-10 | Wending Yan, Rongkai Zhang, Kaihua Ta...

Universal image restoration (UIR) aims to recover clean images from diverse and unknown degradations using a unified model. Existing UIR methods primarily focus on pixel reconstruction and often lack explicit diagnostic reasoning over degradation composition, severity, and scene semantics prior...

Strategic Algorithmic Monoculture:Experimental Evidence from Coordination Games

2026-04-10 | Gonzalo Ballestero, Hadi Hosseini, Sa...

AI agents increasingly operate in multi-agent environments where outcomes depend on coordination. We distinguish primary algorithmic monoculture -- baseline action similarity -- from strategic algorithmic monoculture, whereby agents adjust similarity in response to incentives. We implement a simple experimental design...

RecaLLM: Addressing the Lost-in-Thought Phenomenon with Explicit In-Context Retrieval

2026-04-10 | Kyle Whitecross, Negin Rahimi

We propose RecaLLM, a set of reasoning language models post-trained to make effective use of long-context information. In-context retrieval, which identifies relevant evidence from context, and reasoning are deeply intertwined: retrieval supports reasoning, while reasoning often determines what must be...

Policy-Aware Edge LLM-RAG Framework for Internet of Battlefield Things Mission Orchestration

2026-04-10 | Om Solanki, Lopamudra Praharaj, Deept...

Large Language Models (LLMs) offer a promising interface for intent-driven control of autonomous cyber-physical systems, but their direct use in mission-critical Internet of Battlefield Things (IoBT) environments raises significant safety, reliability, and policy-compliance concerns. This paper presents a Policy-Aware Large...

Agentic Jackal: Live Execution and Semantic Value Grounding for Text-to-JQL

2026-04-10 | Vishnu Murali, Anmol Gulati, Elias Lu...

Translating natural language into Jira Query Language (JQL) requires resolving ambiguous field references, instance-specific categorical values, and complex Boolean predicates. Single-pass LLMs cannot discover which categorical values (e.g., component names or fix versions) actually exist in a given Jira instance,...

E3-TIR: Enhanced Experience Exploitation for Tool-Integrated Reasoning

2026-04-10 | Weiyang Guo, Zesheng Shi, Liye Zhao, ...

While Large Language Models (LLMs) have demonstrated significant potential in Tool-Integrated Reasoning (TIR), existing training paradigms face significant limitations: Zero-RL suffers from inefficient exploration and mode degradation due to a lack of prior guidance, while SFT-then-RL is limited by high...

Confidence Without Competence in AI-Assisted Knowledge Work

2026-04-10 | Elena Eleftheriou, George Pallis, Mar...

Large Language Models (LLMs) are widely used by students, yet their tendency to provide fast and complete answers may discourage reflection and foster overconfidence. We examined how alternative LLM interaction designs support deeper thinking without excessively increasing cognitive burden. We...

Many-Tier Instruction Hierarchy in LLM Agents

2026-04-10 | Jingyu Zhang, Tianjian Li, William Ju...

Large language model agents receive instructions from many sources-system messages, user prompts, tool outputs, and more-each carrying different levels of trust and authority. When these instructions conflict, models must reliably follow the highest-privilege instruction to remain safe and effective. The...

UIPress: Bringing Optical Token Compression to UI-to-Code Generation

2026-04-10 | Dasen Dai, Shuoqi Li, Ronghao Chen, H...

UI-to-Code generation requires vision-language models (VLMs) to produce thousands of tokens of structured HTML/CSS from a single screenshot, making visual token efficiency critical. Existing compression methods either select tokens at inference time using task-agnostic heuristics, or zero out low-attention features...

2. GRC Engineering & AI Governance

Scope: AI Governance, Policy as Code, and Compliance Engineering.

From Abstract Threats to Institutional Realities: A Comparative Semantic Network Analysis of AI Securitisation in the US, EU, and China

2026-01-07 | Ruiyi Guo, Bodong Zhang

Artificial intelligence governance exhibits a striking paradox: while major jurisdictions converge rhetorically around concepts such as safety, risk, and accountability, their regulatory frameworks remain fundamentally divergent and mutually unintelligible. This paper argues that this fragmentation cannot be explained solely by...

From Slaves to Synths? Superintelligence and the Evolution of Legal Personality

2026-01-06 | Simon Chesterman

This essay examines the evolving concept of legal personality through the lens of recent developments in artificial intelligence and the possible emergence of superintelligence. Legal systems have long been open to extending personhood to non-human entities, most prominently corporations, for...

Compliance as a Trust Metric

2026-01-03 | Wenbo Wu, George Konstantinidis

Trust and Reputation Management Systems (TRMSs) are critical for the modern web, yet their reliance on subjective user ratings or narrow Quality of Service (QoS) metrics lacks objective grounding. Concurrently, while regulatory frameworks like GDPR and HIPAA provide objective behavioral...

Verifiable Off-Chain Governance

2025-12-29 | Jake Hartnell, Eugenio Battaglia

Current DAO governance praxis limits organizational expressivity and reduces complex organizational decisions to token-weighted voting due to on-chain computational limits. This paper proposes verifiable off-chain computation (leveraging Verifiable Services, TEEs, and ZK proofs) as a framework to transcend these constraints...

With Great Capabilities Come Great Responsibilities: Introducing the Agentic Risk & Capability Framework for Governing Agentic AI Systems

2025-12-22 | Shaun Khoo, Jessica Foo, Roy Ka-Wei Lee

Agentic AI systems present both significant opportunities and novel risks due to their capacity for autonomous action, encompassing tasks such as code execution, internet interaction, and file modification. This poses considerable challenges for effective organizational governance, particularly in comprehensively identifying,...

Computable Gap Assessment of Artificial Intelligence Governance in Children's Centres: Evidence-Mechanism-Governance-Indicator Modelling of UNICEF's Guidance on AI and Children 3.0 Based on the Graph-GAP Framework

2025-12-20 | Wei Meng

This paper tackles practical challenges in governing child centered artificial intelligence: policy texts state principles and requirements but often lack reproducible evidence anchors, explicit causal pathways, executable governance toolchains, and computable audit metrics. We propose Graph-GAP, a methodology that decomposes...

The Future of the AI Summit Series

2025-12-19 | Lucia Velasco, Charles Martinet, Henr...

This policy memo examines the evolution of the international AI Summit series, initiated at Bletchley Park in 2023 and continued through Seoul in 2024 and Paris in 2025, as a forum for cooperation on the governance of advanced artificial intelligence....

Smart Data Portfolios: A Quantitative Framework for Input Governance in AI

2025-12-18 | A. Talha Yalta, A. Yasemin Yalta

Growing concerns about fairness, privacy, robustness, and transparency have made it a central expectation of AI governance that automated decisions be explainable by institutions and intelligible to affected parties. We introduce the Smart Data Portfolio (SDP) framework, which treats data...

How frontier AI companies could implement an internal audit function

2025-12-16 | Francesca Gomez, Adam Buick, Leah Fer...

Frontier AI developers operate at the intersection of rapid technical progress, extreme risk exposure, and growing regulatory scrutiny. While a range of external evaluations and safety frameworks have emerged, comparatively little attention has been paid to how internal organizational assurance...