
Jiaju Han

Research assistant intern at the Shenzhen Institute of Big Data, working on LLM applications for power systems and energy intelligence.

Email: 1404293404 [at] qq [dot] com · arXiv: 2607.06552 · 2607.06485 · 2607.07288 · 2606.17020 · 2605.22273 · 2603.28568

Education

China University of Petroleum (Beijing)

B.Eng. in Data Science and Big Data Technology (Second Bachelor's Degree), Sep 2025 - Jun 2027

China University of Petroleum (Beijing)

B.A. in English, Sep 2021 - Jul 2025

Research Experience

Shenzhen Institute of Big Data

Research Assistant Intern · Jul 2026 - Present

Work on large language model applications for power systems and energy intelligence.
Contribute to research problem formulation, experiment design, and evaluation.

Research Projects

CFGPatch: Geometric Adversarial Framework for Visible-Infrared VLMs

May 2026 - Completed · Co-author · Submitted to NeurIPS

Contributed to a unified geometric adversarial framework for visible-infrared vision-language models.
Studied cross-task transferability across visible and infrared settings with shared fractal geometry construction.
Participated in experiment organization, result analysis, and manuscript preparation.

Atmospheric Retrieval Hijacking in Remote Sensing Vision-Language RAG

May 2026 - Completed · First author · Submitted to NeurIPS

Led a study on atmospheric-triggered retrieval hijacking in remote sensing vision-language RAG systems.
Designed attack scenarios, retrieval evaluation protocols, and analysis for hallucination-inducing evidence retrieval.
Coordinated manuscript writing and experimental validation across multimodal retrieval and generation settings.

XSPA: Sparse Adversarial Perturbations for Transferable Attacks on VLMs

Oct 2025 - Completed · Second author · Submitted to ACM MM

Contributed to implementation, experiments, evaluation pipeline, and manuscript preparation.
Implemented a fixed X-shaped sparse perturbation pipeline with joint optimization over classification attack objectives, semantic attraction, semantic suppression, and smoothness regularization.
Built batch evaluation workflows across zero-shot classification, image captioning, and VQA settings on multiple CLIP-style encoders and downstream VLMs.
Substantially degraded model performance across these settings while perturbing only about 1.76% of pixels at 224x224 resolution.
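
For scale, 1.76% of a 224x224 image is roughly 883 of its 50,176 pixels. The sketch below shows one way an X-shaped sparse mask and its pixel budget could be computed. The functions `x_mask` and `mask_fraction` and the `width` parameter are hypothetical names chosen for illustration; the exact mask layout used in XSPA is not described here, so this is an assumption rather than the paper's construction.

```python
def x_mask(size: int, width: int = 1) -> list[list[bool]]:
    """Build a size x size boolean mask that is True along both diagonals.

    `width` is an assumed thickness parameter: each diagonal is widened
    into a band of `width` adjacent pixels per row.
    """
    mask = []
    for i in range(size):
        row = []
        for j in range(size):
            on_main = 0 <= j - i < width                # main diagonal band
            on_anti = 0 <= (i + j) - (size - 1) < width  # anti-diagonal band
            row.append(on_main or on_anti)
        mask.append(row)
    return mask


def mask_fraction(mask: list[list[bool]]) -> float:
    """Return the fraction of pixels selected by the mask."""
    total = sum(len(row) for row in mask)
    selected = sum(sum(row) for row in mask)
    return selected / total


if __name__ == "__main__":
    for w in (1, 2):
        print(f"width={w}: {mask_fraction(x_mask(224, w)):.4%} of pixels")
```

A perturbation restricted to such a mask leaves every other pixel untouched, which is what keeps the modified-pixel budget in the low single-digit percent range.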

LLM-Based Two-Stage Credit Rating Explanation Generation System

Nov 2025 - Present

Led a two-stage pipeline for rating prediction, explanation generation, automatic judging, and preference optimization.
Improved rating classification to 89.54% accuracy and 88.80% Macro-F1 through data auditing and feature enhancement.
Built a Llama-3.1-8B-Instruct + LoRA explanation generator covering SFT, candidate generation, LLM-as-judge evaluation, and DPO signal construction.
Reached a 4.8675 / 5 average judge score and 98.8% pass rate on the test set.

ScholarRAG-ZH: Citation-Grounded QA Benchmark and RAG Evaluation Toolkit

Apr 2026 - Present

Designed and implemented a benchmark and evaluation toolkit for academic RAG over TeX, Markdown, and PDF sources.
Built an evidence-annotated benchmark with manually verified queries and metrics covering hit rate, citation precision/recall, answer accuracy, and faithfulness.
Compared lexical retrieval, BM25, latent semantic retrieval, sentence-transformers embeddings, cross-encoder reranking, and structure-aware retrieval.
Achieved 0.92 answer accuracy with structure-aware retrieval on the draft-derived split.
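
As an illustration of the retrieval and citation metrics listed above, the sketch below uses common set-based definitions of citation precision/recall and hit rate at k. These definitions and the function names `citation_precision_recall` and `hit_rate` are assumptions for illustration; the toolkit's actual metric implementations may differ.

```python
def citation_precision_recall(cited: set[str], gold: set[str]) -> tuple[float, float]:
    """Compare the evidence IDs an answer cites against annotated gold evidence.

    Precision: share of cited IDs that are gold evidence.
    Recall: share of gold evidence IDs that were cited.
    """
    correct = len(cited & gold)
    precision = correct / len(cited) if cited else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def hit_rate(retrieved: list[list[str]], gold: list[set[str]], k: int) -> float:
    """Fraction of queries with at least one gold evidence ID in the top-k results."""
    if not retrieved:
        return 0.0
    hits = sum(
        1 for ranked, answers in zip(retrieved, gold) if answers & set(ranked[:k])
    )
    return hits / len(retrieved)


if __name__ == "__main__":
    p, r = citation_precision_recall({"a", "b", "c"}, {"b", "c", "d", "e"})
    print(f"citation precision={p:.2f}, recall={r:.2f}")
    print("hit rate@2 =", hit_rate([["x", "b"], ["y", "z"]], [{"b"}, {"q"}], k=2))
```

Reporting precision and recall separately distinguishes answers that cite too much from answers that cite too little, which a single hit-rate number cannot show.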

Publications

InfraQR: Edge-Placed QR-Inspired Structured Patch Attacks on Infrared Vision-Language Models

arXiv: https://arxiv.org/abs/2607.07288

Author position: Second author

MonoIR-RS: Infrared Remote Sensing Vision-Language Learning with CLIP and VLM Adaptation

arXiv: https://arxiv.org/abs/2607.06552

Submission: ACCV · Author position: First author

AirflowAttack: Thermal-Airflow Adversarial Perturbations against Infrared Remote-Sensing Vision-Language Models

arXiv: https://arxiv.org/abs/2607.06485

Submission: ACCV · Author position: Second author

FusionRS: A Large-Scale RGB-Infrared Remote Sensing Dataset for Dual-Modal Vision-Language Foundation Models

arXiv: https://arxiv.org/abs/2606.17020

Submission: EMNLP · Author position: First author

Exposing Vulnerabilities in Visible-Infrared VLMs: A Unified Geometric Adversarial Framework with Cross-Task Transferability

arXiv: https://arxiv.org/abs/2605.22273

Submission: NeurIPS · Author position: Fifth author

From Clouds to Hallucinations: Atmospheric Retrieval Hijacking in Remote Sensing Vision-Language RAG

arXiv: https://arxiv.org/abs/2605.07273

Submission: NeurIPS · Author position: First author

XSPA: Crafting Imperceptible X-Shaped Sparse Adversarial Perturbations for Transferable Attacks on VLMs

arXiv: https://arxiv.org/abs/2603.28568

Submission: ACM MM · Author position: Second author

When Surfaces Lie: Exploiting Wrinkle-Induced Attention Shift to Attack Vision-Language Models

arXiv: https://arxiv.org/abs/2603.27759

Submission: ACM MM · Author position: Third author

Skills

Python, PyTorch, Hugging Face, LoRA, SFT, DPO, LLM-as-judge, RAG, BM25, sentence-transformers, cross-encoder reranking, Pandas, Matplotlib, Spark, Hadoop, benchmark design, evaluation pipelines, technical writing.