Commit 94cf17c7 authored by jameskrw

minor

parent bc4ca37d
+0 −1
@@ -4,6 +4,5 @@ from typing import Dict, List, Tuple, Optional, Any, Union

@dataclass
class FrozenLakeServiceConfig(BaseServiceConfig):
    preload_reward_model: Dict[str, Any] = field(default_factory=lambda: {"clip": False})
    device: Dict[str, Any] = field(default_factory=lambda: {"clip": 0})
    use_state_reward: bool = False
 No newline at end of file
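
Reviewer note: the dict-valued fields in this hunk use `field(default_factory=...)` because a dataclass rejects a bare mutable default such as `{}`. The sketch below shows the resulting behavior under stated assumptions: `BaseServiceConfig` here is a hypothetical stand-in with an assumed `max_workers` field, not the project's actual base class, and only two of the fields from the hunk are reproduced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class BaseServiceConfig:
    # Hypothetical stand-in for the project's base class (assumption).
    max_workers: int = 48


@dataclass
class FrozenLakeServiceConfig(BaseServiceConfig):
    # default_factory runs once per instance, so each config gets its own dict.
    preload_reward_model: Dict[str, Any] = field(
        default_factory=lambda: {"clip": False}
    )
    use_state_reward: bool = False


first = FrozenLakeServiceConfig()
second = FrozenLakeServiceConfig()

# Mutating one instance's dict must not leak into the other instance.
first.preload_reward_model["clip"] = True
print(second.preload_reward_model)
```

If the two instances shared one dict, the change made through `first` would also show up in `second`; with `default_factory` it does not.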
+2 −0
@@ -65,6 +65,8 @@ class NavigationEnv(BaseEnv):
            "fieldOfView": config.fov,
            "platform": CloudRendering,
            "gpu_device": config.get('gpu_device', 0),
            "server_timeout": 300,
            "server_start_timeout": 300.0,
        }
        
        # Initialize AI2-THOR controller
+22 −1
@@ -64,11 +64,31 @@ prompt_templates:
    worldmodeling: ${prompt_templates.default_env.worldmodeling}
  
  maniskill:
    grounding: ${prompt_templates.default_env.grounding}
    grounding: |
      Compare the description of the current state with the groundtruth current state information.
      Answer YES if the description matches the current state information, or NO if it doesn't.

      # Context
      You are evaluating whether an agent's description accurately reflects the actual state. The description must be both correct overall AND specifically relevant to the important elements of the current state. Generic observations (like "player, box and target is on the ground") that don't capture the meaningful relationships and positions in the state are insufficient. The description should demonstrate understanding of the specific configuration and relationships that matter for decision-making.
      Please also tell if the description includes a dict-formatted state information, if not, please answer NO.

      # Groundtruth Current State Information:
      {state_information_dict}

      # State Description:
      {natural_language_description}

      Think step by step and end with your answer.
      Your answer should be within {max_tokens} tokens and in the format of <think>...</think><answer>YES</answer> or <think>...</think><answer>NO</answer>.

    worldmodeling: |
      Compare the prediction of the next state with the groundtruth next state information.
      Answer YES if the prediction accurately matches the next state information, or NO if it doesn't.

      # Context
      You are evaluating whether an agent's prediction of the next state is accurate. The prediction must be both correct overall AND specifically relevant to the important elements of the next state. Generic predictions that don't capture the meaningful changes, relationships, and positions in the state are insufficient. The prediction should demonstrate understanding of the specific configuration and relationships that will result from the action, showing how the state will transform in ways that matter for decision-making.
      Please also tell if the prediction includes a dict-formatted state information, if not, please answer NO.

      Groundtruth Next State Information:
      {state_information_dict}
      
@@ -78,6 +98,7 @@ prompt_templates:
      Please also allow some errors in the prediction, 
      For example, if the prediction is {"red_cube":(15,30,20)} and the groundtruth is {"red_cube":(12,32,21)}, that should be considered correct.
      For x,y coordinates, the error tolerance is 7, and for z coordinates, the error tolerance is 10.
      
      Think step by step and end with your answer.
      Your answer should be within {max_tokens} tokens and in the format of <think>...</think><answer>YES</answer> or <think>...</think><answer>NO</answer>.
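
Reviewer note: the tolerance rule in this prompt is stated in prose for an LLM judge. For comparison, a deterministic version of the same rule might look like the sketch below. It assumes the tolerances are inclusive, are applied per axis as absolute differences, and that both dicts must name the same objects; none of that is confirmed by the prompt, and the function names are hypothetical.

```python
from typing import Dict, Tuple

Coord = Tuple[float, float, float]

XY_TOLERANCE = 7   # from the prompt: tolerance for x and y
Z_TOLERANCE = 10   # from the prompt: tolerance for z


def within_tolerance(pred: Coord, truth: Coord) -> bool:
    """Check one (x, y, z) position; inclusive bounds are an assumption."""
    dx, dy, dz = (abs(p - t) for p, t in zip(pred, truth))
    return dx <= XY_TOLERANCE and dy <= XY_TOLERANCE and dz <= Z_TOLERANCE


def prediction_matches(pred: Dict[str, Coord], truth: Dict[str, Coord]) -> bool:
    """Require the same object names and every position within tolerance."""
    if pred.keys() != truth.keys():
        return False
    return all(within_tolerance(pred[name], truth[name]) for name in truth)


# The example from the prompt: per-axis differences are 3, 2 and 1.
print(prediction_matches({"red_cube": (15, 30, 20)}, {"red_cube": (12, 32, 21)}))
```

A check like this could serve as a sanity baseline for the judge's YES/NO answers on predictions that do include a dict-formatted state.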
  
+3 −0
@@ -27,4 +27,7 @@ navigation:
primitive_skill:
  max_workers: 48
  use_state_reward: ${use_state_reward}
sokoban:
  max_workers: 48
  use_state_reward: ${use_state_reward}