deepluq.metrics_vla#

Classes#

TokenMetrics

OutputMetrics – Compute various instability and variability metrics for robot actions and TCP positions.

Module Contents#

class deepluq.metrics_vla.TokenMetrics#
shannon_entropy_list = []#
token_prob = []#
pcs = []#
token_prob_inv = []#
pcs_inv = []#
deepgini = []#
calculate_metrics(logits)#
compute_norm_inv_token_metrics(logits)#

Compute various token-level uncertainty and confidence metrics from model logits, normalize them to [0, 1], and invert selected metrics so that higher values consistently indicate greater uncertainty.

Metrics computed:

  • Shannon Entropy (normalized): uncertainty measure normalized by log2(num_classes).

  • Max Token Probability (normalized and inverted): confidence of the top predicted token, normalized and inverted so that higher means less confidence.

  • PCS (Prediction Confidence Score) (inverted): difference between the top two token probabilities, inverted so that higher means more uncertainty.

  • DeepGini (normalized): uncertainty measure normalized by its maximum possible value.

Parameters:

logits (torch.Tensor) – raw output logits from the model with shape (batch_size, num_classes).

Returns:

four lists of float values rounded to 5 decimals, corresponding to:

[shannon_entropy, max_token_prob_inverted, pcs_inverted, deepgini]

Return type:

list
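The four metrics described above can be sketched in plain Python as follows. This is a minimal illustration, not the library's implementation: the function name `token_uncertainty_sketch` is hypothetical, it takes nested lists instead of a `torch.Tensor`, and the min-max scaling used for the max token probability is an assumption, since the documentation does not state how that metric is normalized.

```python
import math
from typing import List


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def token_uncertainty_sketch(logits: List[List[float]]) -> List[List[float]]:
    """Return [entropy, max_prob_inverted, pcs_inverted, deepgini], one list per metric."""
    entropy, max_prob_inv, pcs_inv, gini = [], [], [], []
    for row in logits:
        probs = softmax(row)
        n = len(probs)  # assumes at least two classes
        top1, top2 = sorted(probs, reverse=True)[:2]

        # Shannon entropy in bits, divided by its maximum log2(n).
        h = -sum(p * math.log2(p) for p in probs if p > 0.0)
        entropy.append(round(h / math.log2(n), 5))

        # ASSUMPTION: min-max scale the top probability from [1/n, 1] to [0, 1], then invert.
        max_prob_inv.append(round(1.0 - (top1 - 1.0 / n) / (1.0 - 1.0 / n), 5))

        # Margin between the two largest probabilities, inverted.
        pcs_inv.append(round(1.0 - (top1 - top2), 5))

        # Gini impurity 1 - sum(p^2), divided by its maximum 1 - 1/n.
        g = 1.0 - sum(p * p for p in probs)
        gini.append(round(g / (1.0 - 1.0 / n), 5))
    return [entropy, max_prob_inv, pcs_inv, gini]
```

Under these assumptions, a uniform distribution scores 1.0 on every metric and a sharply peaked one scores 0.0, which matches the stated convention that higher values indicate greater uncertainty.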

clear()#
class deepluq.metrics_vla.OutputMetrics#

Compute various instability and variability metrics for robot actions and TCP positions.

Author: Pablo Valle

Time: 05/22/2025

VARIABILITY = 4#
static _action_array(actions: List[Dict[str, Any]]) → numpy.ndarray#

Convert a list of action dicts to a NumPy array.

Each action dict should contain:

  • "world_vector"

  • "rot_axangle"

  • "gripper"
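A conversion of this kind might look like the sketch below. The function name `action_array_sketch` is hypothetical, and the column order and per-key sizes (for example three translation values, three axis-angle values and one gripper value) are assumptions; the documentation only names the three keys.

```python
from typing import Any, Dict, List

import numpy as np


def action_array_sketch(actions: List[Dict[str, Any]]) -> np.ndarray:
    """Stack action dicts into a (T, M) float array, one row per time step."""
    rows = [
        np.concatenate([
            np.asarray(a["world_vector"], dtype=float).ravel(),
            np.asarray(a["rot_axangle"], dtype=float).ravel(),
            # ASSUMPTION: "gripper" is a scalar or a 1-element sequence.
            np.asarray(a["gripper"], dtype=float).ravel(),
        ])
        for a in actions
    ]
    # np.stack raises ValueError on an empty list, so callers need at least one action.
    return np.stack(rows)
```

Flattening every field with `ravel()` lets the same code accept scalars, lists and arrays for any of the three keys.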

static _compute_instability(arr: numpy.ndarray, order: int = 1, scale: float = 1.0) → numpy.ndarray#

Compute instability metrics by taking successive differences.

Parameters:
  • arr (np.ndarray) – Input array of shape (T, M).

  • order (int) – Number of successive differences to apply (1=position, 2=velocity, 3=acceleration).

  • scale (float) – Scaling factor for difference magnitude.

Returns:

Instability per dimension (M,).

Return type:

np.ndarray
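The successive-difference idea can be sketched with `numpy.diff`. The function name `instability_sketch` is hypothetical, and the aggregation step (mean absolute difference per dimension) is an assumption: the documentation states only that the result has shape (M,), not how the differences are reduced.

```python
import numpy as np


def instability_sketch(arr: np.ndarray, order: int = 1, scale: float = 1.0) -> np.ndarray:
    """Reduce the order-th successive differences of a (T, M) array to shape (M,)."""
    arr = np.asarray(arr, dtype=float)
    if arr.shape[0] <= order:
        # Too few time steps to form a single difference of this order.
        return np.zeros(arr.shape[1])
    diffs = np.diff(arr, n=order, axis=0)  # shape (T - order, M)
    # ASSUMPTION: aggregate with the mean absolute difference along time.
    return scale * np.abs(diffs).mean(axis=0)
```

For example, `instability_sketch([[0, 0], [1, 2], [3, 4]], order=1)` averages the first differences `[[1, 2], [2, 2]]` per column, while `order=2` differences them once more.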

compute_position_instability(actions: List[Dict[str, Any]]) → numpy.ndarray#
compute_velocity_instability(actions: List[Dict[str, Any]]) → numpy.ndarray#
compute_acceleration_instability(actions: List[Dict[str, Any]]) → numpy.ndarray#
static _tcp_array(poses: List[List[float]]) → numpy.ndarray#

Extract TCP positions (x, y, z) from poses.

compute_TCP_position_instability(poses: List[List[float]]) → numpy.ndarray#
compute_TCP_velocity_instability(poses: List[List[float]]) → numpy.ndarray#
compute_TCP_acceleration_instability(poses: List[List[float]]) → numpy.ndarray#
compute_TCP_jerk_instability_gradient(poses: List[List[float]]) → numpy.ndarray#

Compute TCP jerk using numerical gradients and return jerk magnitude per time step.
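A gradient-based jerk computation could be sketched as below, applying `numpy.gradient` three times along the time axis. The function name `tcp_jerk_sketch` and the `dt` parameter are hypothetical (the documented signature takes only `poses`), and the sketch assumes that x, y, z are the first three entries of each pose.

```python
from typing import List

import numpy as np


def tcp_jerk_sketch(poses: List[List[float]], dt: float = 1.0) -> np.ndarray:
    """Return the jerk magnitude at each time step, shape (T,). Needs T >= 2."""
    # ASSUMPTION: the TCP position (x, y, z) occupies the first three pose entries.
    positions = np.asarray(poses, dtype=float)[:, :3]
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    jerk = np.gradient(acceleration, dt, axis=0)
    # Euclidean norm of the jerk vector at every time step.
    return np.linalg.norm(jerk, axis=1)
```

Because `numpy.gradient` falls back to one-sided differences at the boundaries, the first and last few values are less accurate than the interior ones; for a trajectory with x = t³, the interior jerk magnitude comes out as the expected constant 6.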

static compute_execution_variability(variability_models: List[Any], image: Any, action_space: Any, instruction: Any, obs: Dict[str, Any], model_name: str) → numpy.ndarray#

Compute variability across multiple models’ actions.

Returns:

Standard deviation of actions across models.

Return type:

np.ndarray
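The aggregation step alone can be sketched as follows. This sketch does not query any models: the function name `variability_sketch` is hypothetical, it assumes each model's predicted action has already been collected as an equal-length vector, and it assumes a population standard deviation (`ddof=0`), which the documentation does not specify.

```python
from typing import Sequence

import numpy as np


def variability_sketch(per_model_actions: Sequence[Sequence[float]]) -> np.ndarray:
    """Standard deviation of each action dimension across models, shape (M,)."""
    # One row per model, one column per action dimension.
    stacked = np.stack([np.asarray(a, dtype=float) for a in per_model_actions])
    # ASSUMPTION: population standard deviation (NumPy's default, ddof=0).
    return stacked.std(axis=0)
```

A dimension on which all models agree yields 0, so larger values flag the action components where the models disagree most.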