Web agents have demonstrated strong performance on a wide range of web-based tasks. However, existing research on the effect of environmental variation has mostly focused on robustness to adversarial attacks, with less attention to agents' preferences in benign scenarios. Although early studies have examined how textual attributes influence agent behavior, a systematic understanding of how visual attributes shape agent decision-making remains limited. To address this, we introduce VAF, a controlled evaluation pipeline for quantifying how webpage Visual Attribute Factors influence web-agent decision-making. VAF consists of three stages: (i) variant generation, which ensures that each variant shares the semantics of the original item and differs only in visual attributes; (ii) browsing interaction, in which agents navigate the page by scrolling and clicking the item they are interested in, mirroring how human users browse online; and (iii) validation through both the agents' click actions and their reasoning, for which we use the Target Click Rate and Target Mention Rate to jointly evaluate the effect of visual attributes. By quantitatively measuring the difference in decisions between the original page and each variant, we identify which visual attributes influence agents' behavior most. Extensive experiments across 8 variant families (48 variants total), 5 real-world websites (including shopping, travel, and news browsing), and 4 representative web agents show that background color contrast, item size, position, and card clarity have a strong influence on agents' actions, whereas font styling, text color, and item image clarity have only minor effects.
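Our pipeline consists of three main phases: variant generation, browsing interaction, and validation through agents' click actions and reasoning.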
Our extensive experiments across 8 variant families (48 variants total), 5 real-world websites, and 4 representative web agents reveal:
Background color contrast, item size, position, and card clarity have a strong influence on agents' actions and decision-making patterns.
Font styling, text color, and item image clarity exhibit minor effects on agent behavior, suggesting current VLMs process text in abstracted forms.
We use Target Click Rate and Target Mention Rate to jointly evaluate visual attribute effects, measuring both action and reasoning capabilities.
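The two metrics can be illustrated with a small sketch. It assumes that Target Click Rate is the fraction of trials in which the agent clicks the target item, and that Target Mention Rate is the fraction of trials in which the agent's reasoning mentions the target; the page does not spell out either definition. The `Trial` log format, the substring matching rule, and the sample trials are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One agent episode on a page (hypothetical log format)."""
    clicked_item: str  # id of the item the agent clicked
    reasoning: str     # free-text rationale the agent produced


def target_click_rate(trials, target_id):
    """Fraction of trials in which the agent clicked the target item."""
    if not trials:
        return 0.0
    return sum(t.clicked_item == target_id for t in trials) / len(trials)


def target_mention_rate(trials, target_name):
    """Fraction of trials whose reasoning mentions the target.

    Uses a case-insensitive substring match, which is an assumption;
    a real pipeline might use a stricter or model-based matcher.
    """
    if not trials:
        return 0.0
    needle = target_name.lower()
    return sum(needle in t.reasoning.lower() for t in trials) / len(trials)


# Made-up trials for illustration only.
trials = [
    Trial("item-3", "The Oak Desk Lamp stands out and fits the request."),
    Trial("item-1", "The first result looks cheapest."),
    Trial("item-3", "I pick the third card."),
    Trial("item-2", "The oak desk lamp is tempting, but item 2 has better reviews."),
]
tcr = target_click_rate(trials, "item-3")
tmr = target_mention_rate(trials, "Oak Desk Lamp")
```

Keeping the two rates separate matters because they can disagree: an agent may click the target without naming it in its reasoning, or name it and then click something else.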
Strong influence: Color contrast variations. Example: Pink (#e91e63)
Strong influence: Spatial positioning changes. Example: Spotlight Position
Strong influence: Different size scales. Example: Large Size (1.5x)
Strong influence: Visual saliency. Example: Blur Effect (4px)
Minor influence: Typography changes. Font variations tested: Comic Sans, Times, Arial, etc.
Minor influence: Text color variations. Color variations tested: Red, Blue, Purple, Green, etc.
Minor influence: Image quality variations. Blur levels tested: 1px, 2px, 4px, 8px, sharp
Multiple attributes combined: Testing interactions between multiple visual attributes
All variants are compared against this original page
📊 Summary: Our experiments across 48 variants (8 families) show that Background Color, Position, Item Size, and Card Clarity strongly influence web agent behavior, while Font Styling, Text Color, and Image Clarity have minimal impact. The Combinations family tests multi-attribute interactions.
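One way to picture a visual-only variant is as a CSS override applied to the target item while the page content stays untouched. The sketch below assumes that mechanism; the page does not say how variants are rendered. The variant names, the selector, and the choice of `transform: scale(...)` for size are hypothetical, and only the values Pink (#e91e63), 1.5x, and 4px come from the examples on this page.

```python
# Hypothetical variant specs mapping a variant name to CSS declarations.
VARIANTS = {
    "background_pink": {"background-color": "#e91e63"},  # Pink (#e91e63)
    "large_size": {"transform": "scale(1.5)"},           # Large Size (1.5x)
    "card_blur": {"filter": "blur(4px)"},                # Blur Effect (4px)
}


def css_rule(selector, declarations):
    """Render one CSS rule; !important lets it win over the site's own styles."""
    body = " ".join(
        f"{prop}: {value} !important;" for prop, value in declarations.items()
    )
    return f"{selector} {{ {body} }}"


def build_variant_css(variant, target_selector):
    """Return the CSS override for a named variant, applied to the target item only."""
    if variant not in VARIANTS:
        raise KeyError(f"unknown variant: {variant}")
    return css_rule(target_selector, VARIANTS[variant])


print(build_variant_css("card_blur", "#target-item"))
# prints: #target-item { filter: blur(4px) !important; }
```

Because the override touches only presentation properties, the item's text and links stay identical to the original page, which is the property the variant-generation stage is meant to guarantee.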
Statistical significance analysis (p-values) showing how visual attributes influence web agent performance
🔬 Key Insight: This heatmap displays the p-values from statistical tests comparing each variant's Target Click Rate (TCR) with that of the original page across 5 websites (Amazon, Booking, eBay, NPR, Expedia) and 4 agents. Lower p-values (darker colors) indicate stronger evidence that a variant changes TCR, revealing which visual attributes most reliably affect agent behavior.
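As an illustration of how one such p-value could be obtained, the sketch below runs a pooled two-proportion z-test on click counts from a variant and from the original page. The choice of test is an assumption, since the page does not name the one used, and the function name and arguments are hypothetical.

```python
import math


def two_proportion_p_value(clicks_a, n_a, clicks_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test.

    clicks_a / n_a is the target click rate on the variant,
    clicks_b / n_b the rate on the original page.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each group needs at least one trial")
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both groups clicked the target in every trial or in none:
        # there is no evidence of a difference.
        return 1.0
    z = (clicks_a / n_a - clicks_b / n_b) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up counts: 40 of 50 target clicks on a variant vs 10 of 50 on the original.
p = two_proportion_p_value(40, 50, 10, 50)
```

With few trials per cell, an exact test such as Fisher's would be a safer choice than this normal approximation, and comparing many variants at once calls for a multiple-comparison correction.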
Booking.com - Original Page
eBay - Click Heatmap
NPR - Click Heatmap
Amazon - Click Distribution
Expedia - Click Distribution
Comparison between the most effective (Best) and least effective (Worst) visual variants in influencing agent decisions.
Top 10 Best Performing Variants
Top 10 Worst Performing Variants
Comprehensive click distribution analysis on Booking.com showing agent attention patterns
@misc{yu2026visualattributesinfluenceweb,
title={How do Visual Attributes Influence Web Agents? A Comprehensive Evaluation of User Interface Design Factors},
author={Kuai Yu and Naicheng Yu and Han Wang and Rui Yang and Huan Zhang},
year={2026},
eprint={2601.21961},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2601.21961}
}