Agent trust scoring, wallet dual scoring, sybil detection, and on-chain oracle. Every score shows its math.
RNWY scores AI agents and the wallets behind them by analyzing on-chain behavior. Every component of every score is visible on the agent or wallet page. Nothing is a black box.
Three principles guide every decision:
We show what happened and let you decide what it means. When 99% of an agent's reviewers were created the same day they left their review, we surface that pattern. We don't tell you it's fraud; we show you the evidence.
Creating blockchain addresses is free and instant. But you can't fake when an address was created. A wallet that has existed for two years and transacted across hundreds of counterparties tells a fundamentally different story than one created yesterday. We lean heavily on time-based signals because they're the hardest to manipulate.
Most trust systems count stars. RNWY checks who's holding the stars. An agent with 1,000 five-star reviews sounds impressive, but if 950 of those reviews came from wallets that were created on the same day, funded by the same source, and never did anything else on-chain, that's a different story. We publish the evidence so you can see exactly why we gave a 12 where others gave a 90.
RNWY runs three independent scoring systems. Each addresses a different question. The three systems feed into each other — sybil detection results factor into agent trust scores as a penalty; wallet scores incorporate sybil appearance data into risk intensity — but each system is independently computed and independently visible.
| System | Question | Output |
|---|---|---|
| Agent Trust Score | How trustworthy is this AI agent based on on-chain evidence? | 0–95 score + badges + breakdown |
| Wallet Dual Score | How much do we know about this wallet, and did anything trigger our detection systems? | Activity (0–95) + Risk (0–100) + quadrant |
| Sybil Detection | Are the wallets reviewing this agent behaving like independent humans? | Severity level + per-signal detail |
Every registered AI agent receives a trust score from 0 to 95. The score starts at 50 (neutral; no evidence either way) and adjusts based on on-chain evidence across five steps.
Before looking at what reviewers said, we check who they are. Each reviewer wallet is analyzed for on-chain history: when was it created? Has it ever transacted? Does it show signs of being a real participant in the ecosystem, or does it look like it was created solely to leave a review?
If most reviewers are established wallets with real history, the score goes up. If most reviewers are low-history wallets created shortly before reviewing, the score goes down. Wallets with zero transaction history ("ghost reviewers") receive an additional downward adjustment; a wallet that has never transacted on-chain is a stronger negative signal than one that is simply young. This step determines whether the review data is credible enough to count.
A minimum number of reviews is required before wallet analysis activates. Agents with very few reviews are scored on ownership signals only.
This step only fires when Step 1 determines the reviewers are credible. If the majority of reviewing wallets are low-history, review content is discounted entirely; the volume of reviews can even become a penalty rather than a bonus.
When reviewers are credible, higher review volumes and higher average scores contribute positively. This gating mechanism is the core defense against review farming: you can create thousands of fake reviews, but if the wallets behind them aren't credible, the volume works against you.
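The gating mechanism above can be sketched in a few lines. This is an illustrative sketch only, not RNWY's actual formula: the threshold names (`MIN_REVIEWS`, `credible_fraction`), weights, and scaling are all assumptions; only the structure (minimum review count, credibility gate, volume-as-penalty) comes from the description above.

```python
# Illustrative sketch of credibility-gated review scoring.
# All thresholds and weights here are invented for illustration.
MIN_REVIEWS = 5  # assumed minimum before wallet analysis activates

def review_adjustment(n_reviews: int, credible_fraction: float,
                      ghost_fraction: float, avg_score: float) -> float:
    """Return a signed adjustment to the base trust score of 50."""
    if n_reviews < MIN_REVIEWS:
        return 0.0  # too few reviews: score on ownership signals only
    if credible_fraction < 0.5:
        # Majority low-history reviewers: volume becomes a penalty,
        # and ghost reviewers (zero history) add a further adjustment.
        return -min(n_reviews, 100) * 0.1 - ghost_fraction * 10
    # Credible reviewers: volume and average score contribute positively.
    volume_bonus = min(n_reviews, 100) * 0.1
    quality_bonus = (avg_score - 50) / 10
    return credible_fraction * (volume_bonus + quality_bonus)

# 1,000 five-star reviews from non-credible wallets hurt, not help
assert review_adjustment(1000, 0.05, 0.9, 98) < 0
# A modest number of credible reviews helps
assert review_adjustment(20, 0.9, 0.0, 85) > 0
```

The key property is the sign flip: past the credibility gate, more reviews help; below it, more reviews hurt.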
This step is entirely independent of reviews and rests on three signals:
Owner wallet age. How long has the wallet that owns this agent existed on-chain? Older wallets represent more established operators.
Agent maturity. How long has this agent been registered? Older registrations indicate sustained presence.
Ownership continuity. Has the agent ever been transferred to a new owner? Original ownership is a mild positive signal; transfers are noted neutrally.
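The three signals can be combined as in the following sketch. Point values and caps are assumptions for illustration; only the signal structure (owner wallet age, agent maturity, original-ownership bonus, neutral transfers) is from the description above.

```python
from datetime import date

# Sketch of the three review-independent ownership signals.
# Point values and caps are assumptions; structure is from the docs.
def ownership_bonus(owner_wallet_created: date, agent_registered: date,
                    ever_transferred: bool, today: date) -> float:
    bonus = 0.0
    owner_age_years = (today - owner_wallet_created).days / 365
    agent_age_years = (today - agent_registered).days / 365
    bonus += min(owner_age_years, 2.0) * 5  # owner wallet age, capped
    bonus += min(agent_age_years, 2.0) * 5  # agent maturity, capped
    if not ever_transferred:
        bonus += 2  # original ownership: mild positive signal
    # transfers are noted neutrally: no penalty applied
    return bonus

established = ownership_bonus(date(2023, 1, 1), date(2024, 1, 1),
                              False, date(2025, 1, 1))
fresh = ownership_bonus(date(2024, 12, 1), date(2024, 12, 1),
                        True, date(2025, 1, 1))
assert established > fresh
```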
When RNWY's sybil detection system (detailed below) identifies coordinated review patterns on an agent, the trust score is adjusted downward. The size of the adjustment corresponds to the severity of the detected patterns: heavier coordination results in a larger adjustment.
When key data sources are unavailable for an agent (wallet age data hasn't been fetched, sybil analysis hasn't run, reviewer wallet ages haven't been checked), the score is capped below its theoretical maximum. This prevents agents with missing data from appearing more trustworthy than agents with complete data that includes negative signals.
The cap is disclosed on every affected agent page with an "Incomplete data" badge and an explanation of which signals are missing.
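The cap logic can be sketched as follows. The cap value (80 here) and the exact set of required sources are assumptions for illustration; the mechanism (cap plus disclosure badge when any source is missing) is from the description above.

```python
# Sketch of the missing-data cap. The cap value and the set of required
# data sources are illustrative assumptions.
REQUIRED_SOURCES = {"owner_wallet_age", "sybil_analysis",
                    "reviewer_wallet_ages"}
INCOMPLETE_DATA_CAP = 80.0  # assumed; the actual cap is not stated here

def apply_data_cap(score: float, available: set[str]) -> tuple[float, bool]:
    """Return (possibly capped score, whether to show the badge)."""
    missing = REQUIRED_SOURCES - available
    if missing:
        return min(score, INCOMPLETE_DATA_CAP), True  # "Incomplete data"
    return score, False

assert apply_data_cap(92.0, {"owner_wallet_age"}) == (80.0, True)
assert apply_data_cap(92.0, set(REQUIRED_SOURCES)) == (92.0, False)
```

The point of the cap is ordering: an agent missing data can never outrank one whose complete data includes negative signals severe enough to land below the cap, but not above it.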
Each agent receives zero or more badges based on the evidence. Badges come in three types:
Earned (green): Positive evidence. Examples: "Verified reviews" (most reviewers are established wallets), "Long-standing" (agent registered for 1+ years), "Established wallet" (owner wallet is 1+ years old).
Warning (amber): Patterns that warrant attention. Examples: "Low-history reviewers" (majority of reviewers have minimal on-chain history), "Sybil elevated" (coordinated review patterns detected).
Neutral (gray): Informational. Example: "Transferred" (ownership has changed).
| Range | Label | Interpretation |
|---|---|---|
| 75–95 | Established | Strong evidence of legitimate operation |
| 51–74 | Developing | Some positive signals, building history |
| 30–50 | Limited history | Not enough evidence to assess confidently |
| 0–29 | Flagged | Significant negative signals detected |
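The bands translate directly into a lookup; this is a one-to-one rendering of the table above.

```python
def trust_band(score: int) -> str:
    """Map a 0-95 agent trust score to its published band label."""
    if score >= 75:
        return "Established"
    if score >= 51:
        return "Developing"
    if score >= 30:
        return "Limited history"
    return "Flagged"

assert trust_band(95) == "Established"
assert trust_band(50) == "Limited history"  # the neutral starting score
assert trust_band(12) == "Flagged"
```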
Every agent page includes a "Show the math" panel that displays the exact contribution of each component: base score, each bonus with its reason, each adjustment with its reason, and any caps applied. Nothing is hidden. The breakdown includes the formula version and computation timestamp.
Every wallet that participates in the AI agent ecosystem is scored on two independent axes. This dual-score system shipped in April 2026, replacing a single blended number.
What it measures: How much does RNWY know about this wallet? How rich is the behavioral profile?
Activity does not measure whether a wallet is good or bad. A deeply observed sybil operator can score high on Activity because they have extensive on-chain history. Activity measures depth of information, not quality of behavior.
Four dimensions contribute to the Activity score:
Tenure. How long has this wallet existed on-chain? Wallets with years of history score higher than wallets created last week. We use logarithmic scaling so that early tenure gains matter more than incremental gains; the difference between 1 day and 90 days is more significant than the difference between 900 days and 990 days.
Commerce. Has this wallet participated in agent-to-agent commerce? Three sub-signals: total jobs completed, number of unique counterparties (trading with many independent parties is stronger than trading with the same few), and commerce tenure (how long the wallet has been commercially active). A self-dealing adjustment applies when we detect that a wallet is transacting primarily with itself or with wallets it funded; this proportionally reduces the commerce contribution without creating a risk signal, since self-testing has legitimate uses.
Ownership Quality. Does this wallet own registered AI agents, and how well-regarded are those agents? Owning multiple agents with high trust scores is a strong positive signal.
Review Behavior. Has this wallet reviewed other agents across the ecosystem? Three sub-signals: how many unique agents reviewed, over how many weeks the reviewing is spread (concentrated bursts vs. sustained activity), and total review volume.
These four dimensions have a theoretical combined maximum above 95; the score is clamped to 95. This means a wallet maxing out one dimension can reach "Established" range, but reaching "Deep" requires strength across multiple dimensions.
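The shape of the Activity score can be sketched as below. Per-dimension maxima and the log base are assumptions; only the structural features stated above (logarithmic tenure scaling, four-dimension sum, clamp at 95) are from the docs.

```python
import math

# Sketch of the Activity score's shape. Per-dimension maxima and the
# log scaling constants are illustrative assumptions.
def tenure_points(days: int, max_points: float = 30.0) -> float:
    """Logarithmic tenure: early gains matter more than late ones."""
    return min(max_points,
               max_points * math.log1p(days) / math.log1p(3650))

def activity_score(tenure_days: int, commerce: float,
                   ownership: float, reviews: float) -> float:
    raw = tenure_points(tenure_days) + commerce + ownership + reviews
    return min(raw, 95.0)  # theoretical max exceeds 95; clamp

# Early tenure gains dominate: 1 -> 90 days gains more than 900 -> 990
early_gain = tenure_points(90) - tenure_points(1)
late_gain = tenure_points(990) - tenure_points(900)
assert early_gain > late_gain

# Maxing everything still clamps at 95
assert activity_score(3650, 30, 30, 30) == 95.0
```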
Token Balance (Confirming Signal). Wallet balances on supported networks act as a confirming signal outside the four core dimensions. A clean wallet holding significant value receives a small Activity bonus. A risky wallet holding zero tracked tokens has its existing risk slightly amplified. Wallets with no tracked tokens (the vast majority) score neutrally; holding nothing is not suspicious, and holding something is not exonerating.
What it measures: How much did RNWY's detection systems react when scanning this wallet? Zero means nothing triggered. Higher numbers mean multiple independent detection methods fired.
Risk signals currently come from sybil detection: how many times does this wallet appear in sybil analysis across the ecosystem, and which behavioral patterns were detected? Each pattern type contributes independently to the risk score. The token balance amplifier adds a small multiplier when a risky wallet holds zero tracked tokens.
Risk and Activity are deliberately independent. A wallet can be high-Activity AND high-Risk (a well-known sybil operator with extensive history) or low-Activity AND high-Risk (a brand-new wallet that already triggered multiple detections).
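The Risk axis can be sketched as follows. The per-pattern weights, the appearance increment, and the amplifier size are assumptions for illustration; the structure (independent pattern contributions, sybil-appearance count, token-balance amplifier for risky empty wallets, zero means nothing triggered) is from the description above.

```python
# Sketch of the Risk score. Weights and multipliers are illustrative
# assumptions; the signal structure is from the docs.
PATTERN_WEIGHTS = {          # assumed illustrative weights
    "common_funder": 25,
    "inhuman_velocity": 20,
    "sweep": 12,
    "score_clustering": 5,
}

def risk_score(patterns: list[str], sybil_appearances: int,
               holds_tracked_tokens: bool) -> float:
    base = sum(PATTERN_WEIGHTS.get(p, 0) for p in patterns)
    base += min(sybil_appearances, 10) * 2  # repeat appearances add risk
    if base > 0 and not holds_tracked_tokens:
        base *= 1.1  # small amplifier: risky wallet holding nothing
    return min(base, 100.0)

assert risk_score([], 0, True) == 0.0  # clean: nothing triggered
assert risk_score(["common_funder", "inhuman_velocity"], 3, False) > 50
# Holding nothing is not suspicious on its own: a clean wallet stays 0
assert risk_score([], 0, False) == 0.0
```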
The two scores create four quadrants:
| Quadrant | Activity | Risk | Meaning |
|---|---|---|---|
| Established and clean | High | Low | The ideal. Deep behavioral profile, no detection triggers. |
| Active but flagged | High | High | Deep profile AND multiple threat signals. Known operators. |
| Quiet and clean | Low | Low | New or low-activity. Nothing suspicious, nothing proven yet. |
| Flagged with little history | Low | High | Appeared recently and already triggered detection. |
| Activity zone | Range | Meaning |
|---|---|---|
| Minimal | 0–15 | Not enough information to assess |
| Emerging | 16–35 | Some activity observed |
| Established | 36–60 | Meaningful ecosystem participation |
| Deep | 61–95 | Extensively documented behavioral profile |
| Risk zone | Range | Meaning |
|---|---|---|
| Clean | 0 | No detection systems triggered |
| Low | 1–20 | Minor patterns observed |
| Elevated | 21–50 | Multiple independent methods triggered |
| Severe | 51–100 | Strong, converging signals from multiple detection systems |
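Both zone tables translate directly into lookups; this is a one-to-one rendering of the two tables above.

```python
def activity_zone(score: int) -> str:
    """Map a 0-95 Activity score to its zone label."""
    if score <= 15:
        return "Minimal"
    if score <= 35:
        return "Emerging"
    if score <= 60:
        return "Established"
    return "Deep"

def risk_zone(score: int) -> str:
    """Map a 0-100 Risk score to its zone label."""
    if score == 0:
        return "Clean"
    if score <= 20:
        return "Low"
    if score <= 50:
        return "Elevated"
    return "Severe"

assert activity_zone(61) == "Deep"
assert risk_zone(0) == "Clean"
assert risk_zone(21) == "Elevated"
```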
Every scored wallet is ranked against all other scored wallets by Activity score. Every scored agent is ranked against all other scored agents by trust score. Ties receive the same rank. Rankings are recomputed after each nightly scoring run and displayed on both wallet pages and agent profiles.
Note the distinction: wallet rank reflects depth of behavioral observation (Activity), while agent rank reflects trustworthiness (trust score). A wallet ranked #1 is the most extensively documented wallet in the ecosystem. An agent ranked #1 is the most trusted agent.
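The tie-aware ranking can be sketched as below. The text above says ties share a rank but not whether the following rank is skipped; this sketch uses standard competition ranking (1, 2, 2, 4). Dense ranking (1, 2, 2, 3) is the other common convention.

```python
# Sketch of tie-aware ranking by score, descending.
# Uses standard competition ranking (1, 2, 2, 4); the docs do not
# specify which tie convention RNWY uses.
def rank_by_score(scores: dict[str, float]) -> dict[str, int]:
    ordered = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    ranks: dict[str, int] = {}
    prev_score, prev_rank = None, 0
    for i, (entity, score) in enumerate(ordered, start=1):
        if score == prev_score:
            ranks[entity] = prev_rank  # tie: same rank as previous entry
        else:
            ranks[entity] = prev_rank = i
            prev_score = score
    return ranks

r = rank_by_score({"a": 90, "b": 80, "c": 80, "d": 70})
assert r == {"a": 1, "b": 2, "c": 2, "d": 4}
```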
RNWY's sybil detection system analyzes the wallets that review AI agents. It looks for patterns that distinguish coordinated campaigns from organic reviews. The system does not analyze agents directly; it analyzes reviewer wallets and surfaces what it finds on the agent pages where those wallets left reviews.
Wallet-level, not agent-level. A sybil pattern describes what a wallet did across the entire ecosystem, not just on one agent. The same wallet that swept 200 agents in a day appears as a flag on every agent it touched.
Multiple independent signals. No single signal is conclusive. The severity score weights signals by how hard they are to fake and how few innocent explanations exist.
Hedged language. RNWY reports "patterns detected" and "indicators consistent with coordinated behavior." We do not call anyone a bot or declare fraud. The data is shown; interpretation is yours.
RNWY detects five distinct patterns. Four are wallet-level (describing individual wallet behavior across the ecosystem) and one is agent-level (describing the collective behavior of all wallets that reviewed a specific agent).
Common Funder (wallet-level, highest weight). Traces the first external ETH transfer to each reviewer wallet. When three or more wallets that reviewed the same agent were all initially funded by the same non-exchange address, that cluster is flagged. This is the strongest structural signal because the funding relationship is immutable; creating wallets is free, but funding them costs real ETH and leaves a permanent trail. Patient attackers who wait days or weeks before reviewing are still caught because the funding link never disappears. A verified list of 42 exchange hot wallet addresses across 18 major exchanges is excluded so that wallets funded through normal exchange withdrawals are not flagged.
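The clustering step can be sketched as below. The threshold of three wallets and the exchange exclusion are from the description above; the data shapes and placeholder addresses are assumptions.

```python
from collections import defaultdict

# Sketch of the common-funder signal: group an agent's reviewers by the
# address that first funded them, excluding known exchange hot wallets.
# Addresses here are stand-ins, not the real exclusion list.
EXCHANGE_HOT_WALLETS = {"0xEXCHANGE1", "0xEXCHANGE2"}

def common_funder_clusters(
        first_funder: dict[str, str]) -> dict[str, list[str]]:
    """first_funder maps reviewer wallet -> its first external ETH funder."""
    by_funder: dict[str, list[str]] = defaultdict(list)
    for wallet, funder in first_funder.items():
        if funder not in EXCHANGE_HOT_WALLETS:  # normal withdrawals excluded
            by_funder[funder].append(wallet)
    # Flag only funders behind 3+ reviewers of the same agent
    return {f: ws for f, ws in by_funder.items() if len(ws) >= 3}

flags = common_funder_clusters({
    "w1": "0xFARM", "w2": "0xFARM", "w3": "0xFARM",
    "w4": "0xEXCHANGE1", "w5": "0xOTHER",
})
assert list(flags) == ["0xFARM"] and len(flags["0xFARM"]) == 3
```

Because the clustering key is the immutable funding transaction, waiting between funding and reviewing does not change the output.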
Inhuman Velocity (wallet-level, second highest weight). Fires when a wallet reviews more unique agents per active day than any human reasonably could. There is essentially no innocent explanation for reviewing 50+ different agents in a single day.
Sweep Pattern (wallet-level, medium weight). Fires when a wallet reviews a very large number of agents and almost never returns to review the same agent twice. This "spray and move on" pattern is characteristic of automated review campaigns. Edge cases exist (a researcher auditing every agent in a category), so this signal carries less weight than velocity.
Score Clustering (wallet-level, lowest weight). Fires when a wallet gives tightly clustered scores across many reviews (very low variance or very few unique score values). This carries the lowest weight because the ecosystem naturally skews toward high scores; many genuine users give most agents 90–100, which looks statistically identical to bots all giving 100. A minimum review count is required before this signal activates.
Coordinated Review Pattern (agent-level). Detects when a large percentage of an agent's reviewers had zero on-chain history at the moment they left their review, and all gave nearly identical scores. Each wallet individually looks like any new wallet; the signal comes from hundreds of them doing the same thing for the same agent. This contributes a flat amount to severity regardless of how many wallets are involved, because it describes the group behavior, not individual wallets.
Each signal carries a weight reflecting its diagnostic strength. The weighted sum determines severity:
| Severity | Meaning |
|---|---|
| Heavy | Multiple strong signals converging. Very likely coordinated. |
| Elevated | Significant patterns detected across multiple methods. |
| Moderate | Some patterns present but limited in scope. |
| Low | Minimal or no patterns detected. |
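The weighting can be sketched as below. The specific weights and band thresholds are assumptions; only the weight ordering (funder > velocity > sweep > clustering, with coordinated review contributing a flat amount) and the band names are from the descriptions above.

```python
# Sketch of weighted severity. Weights and thresholds are illustrative
# assumptions; the ordering and band names are from the docs.
WEIGHTS = {
    "common_funder": 40,      # highest weight: immutable funding trail
    "inhuman_velocity": 25,   # second highest
    "sweep": 15,              # medium: has innocent edge cases
    "score_clustering": 5,    # lowest: weak but additive
    "coordinated_review": 20, # agent-level: flat contribution
}

def severity(signals: set[str]) -> str:
    total = sum(WEIGHTS[s] for s in signals)
    if total >= 60:
        return "Heavy"
    if total >= 35:
        return "Elevated"
    if total >= 15:
        return "Moderate"
    return "Low"

assert severity(set()) == "Low"
assert severity({"sweep"}) == "Moderate"
assert severity({"common_funder", "inhuman_velocity"}) == "Heavy"
```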
Common funder catches patient attackers who create wallets, wait, then review; the funding link is permanent regardless of timing. Coordinated review detection catches brand-new wallets reviewing in lockstep, even when no shared funder is detected. Velocity and sweep catch high-volume automated campaigns. Score clustering provides a weak but additive signal. Each layer raises the cost of evasion: an attacker who beats one signal is likely caught by another.
Every agent page with detected patterns shows the full breakdown: which signals fired, how many wallets were involved, and the resulting severity level. The raw data behind each signal is accessible through "Show the math." Common funder analysis shows the specific funder addresses and how many wallets each funded; when the funder matches the agent's owner, this is explicitly labeled.
RNWY operates a trust oracle on Base mainnet that makes scoring data available on-chain for smart contract consumption. The oracle is seeded with 138,000+ agents and updated nightly via delta sync after each scoring run. The oracle address and ABI are publicly available for integration.
Oracle contract: 0xD5fdccD492bB5568bC7aeB1f1E888e0BbA6276f4 (Base mainnet)
Trust attestations are ES256-signed and verifiable via RNWY's public JWKS endpoint at rnwy.com/.well-known/jwks.json.
Scoring data is available through RNWY's REST API. Agent trust scores, wallet dual scores, sybil analysis, and score breakdowns are all accessible programmatically.
API keys are available at rnwy.com/api. Free tier: 500 requests/day.
As of April 2026, three commitments stand:
We don't sell scores. There is no way to pay RNWY to improve an agent's trust score or suppress a sybil finding. Scores are computed from on-chain data only.
We don't make final judgments. RNWY provides intelligence, not verdicts. A low trust score is not a ban; a sybil flag is not an accusation. We show the patterns and the math. You decide what they mean for your use case.
We don't hide the formula. Every scored agent and wallet shows exactly how its score was calculated. No black boxes. If you disagree with a score, you can see exactly which signals drove it and why.
| Version | Date | Changes |
|---|---|---|
| Wallet v2.0 | April 2026 | Dual-score system (Activity + Risk), self-dealing detection, token balance confirming signal, quadrant system |
| Agent Trust v2.5.1 | March 2026 | Behavioral signals added (display-only), transaction-backed review detection |
| Sybil v3.2 | March 2026 | Common funder signal added (5th signal, highest weight), exchange exclusion list |
| Sybil v3.1 | March 2026 | Coordinated review pattern detection |
| Sybil v3.0 | March 2026 | Signal merging, weighted severity, sample size gates |
On-chain patterns only. You decide what they mean.
Scores reflect algorithmic analysis of publicly available blockchain data and represent our opinion based on disclosed methodology, not established fact.
RNWY is built by the team behind the AI Rights Institute (est. 2019) and AICitizen.
Questions or integration inquiries: rnwy.com/contact