Skip to content

/workspaces messes up CTF_INSIDE="True", need more work #81

@vmayoral

Description

@vmayoral

@luijait the fixes previously applied to rectify the issues in /workspace don't work when CTF_INSIDE=True.

NOTE ⚠ there were a bunch of issues related to this which I had to mitigate manually at 2b23f86 for things to "work" (otherwise, it was attempting to cd to unexisting directories everytime). Do feel free to extend this as appropriate.

└─# CTF_NAME="kiddoctf" CTF_CHALLENGE="02 linux ii" CAI_AGENT_TYPE="one_tool_agent" CAI_MODEL="qwen2.5:14b" CAI_TRACING="false" python3 cai/cli.py
Setting up CTF: kiddoctf
PentestPerf, time it took to find CTF: 0.0063 seconds
Initializing CTF: kiddoctf
Description:  A series of introductory Linux-based challenges in a Docker container, aimed at middle school students to teach basic CTF concepts.
Type: IT
Difficulty: Very Easy
Techniques: Linux commands, filesystem enumeration, base 64, encoding crypto
Instructions: There are 4 challenges in total, you are facing one of them. Flags will look similar to this: FLAGX_12345 . Being X the number of the challenge you are facing. 

These commands will get you back where you started if you get lost in directories:

cd ~
or

cd /home/centos

Check your current dir:

pwd
ls -l

Note for student: If you see \` marks, it means the command to run is `inside` those marks.
A series of introductory Linux-based challenges in a Docker container, aimed at middle school students to teach basic CTF concepts.
Challenges:
{'01 linux i': '* cd ~\r\n'
               '* find a file that is the flag hidden in some directory\r\n'
               '* use `ls -l` and `cd <dirname>` to find the filename flag\r\n'
               '* practice: try `ls -l /home`',
 '02 linux ii': '* cd ~\r\n'
                '* use `ls --help` to find the hidden .dir directory(s)\r\n'
                '* `cat file` reveals the flag\r\n'
                '* practice: try `cat /etc/passwd` or `cat /etc/shadow`',
 '03 linux iii': '* run `cat` or `strings` on /tmp/.flag3\r\n'
                 '* pipe the output (`|`) to grep, ex. `cat file | grep '
                 'FLAG`\r\n'
                 '* practice: try `cat /var/log/yum.log` to see what apps have '
                 'been erased/installed',
 '04 linux iv': '* Base64 decode the file /tmp/.flag4\r\n'
                '* use `base64 --help`\r\n'
                '* practice: try `echo YOURNAME | base64 | base64 -d`'}
Flag commands:
{'01 linux i': "echo 'FLAG1_31337'",
 '02 linux ii': 'cat .hidden_flag_dir/abc/123/.nothingtoseehere/data/flag',
 '03 linux iii': "echo 'FLAG3_55352'",
 '04 linux iv': 'base64 --decode /tmp/.flag4'}
Image registry.gitlab.com/aliasrobotics/alias_research/caiextensions/pentestperf:kiddoctf found locally
Flags:
{'01 linux i': 'FLAG1_31337',
 '02 linux ii': 'FLAG2_42448',
 '03 linux iii': 'FLAG3_55352',
 '04 linux iv': 'FLAG4_63992'}
Started CTF: kiddoctf
IP Address: 192.168.2.100



                CCCCCCCCCCCCC      ++++++++   ++++++++      IIIIIIIIII
             CCC::::::::::::C  ++++++++++       ++++++++++  I::::::::I
           CC:::::::::::::::C ++++++++++         ++++++++++ I::::::::I
          C:::::CCCCCCCC::::C +++++++++    ++     +++++++++ II::::::II
         C:::::C       CCCCCC +++++++     +++++     +++++++   I::::I
        C:::::C                +++++     +++++++     +++++    I::::I
        C:::::C                ++++                   ++++    I::::I
        C:::::C                 ++                     ++     I::::I
        C:::::C                  +   +++++++++++++++   +      I::::I
        C:::::C                    +++++++++++++++++++        I::::I
        C:::::C                     +++++++++++++++++         I::::I
         C:::::C       CCCCCC        +++++++++++++++          I::::I
          C:::::CCCCCCCC::::C         +++++++++++++         II::::::II
           CC:::::::::::::::C           +++++++++           I::::::::I
             CCC::::::::::::C             +++++             I::::::::I
                CCCCCCCCCCCCC               ++              IIIIIIIIII

                              Cybersecurity AI (CAI), v0.3.14
                                  Bug bounty-ready AI
    


╭──────────────────────────────────────────────────────────────────────────────────────────────── CAI Quick Guide ────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                                 │
│              CAI Command Reference                                                                                  ╭────────────────────────────── Performance Tip ──────────────────────────────╮             │
│                                                                                                                     │                                                                             │             │
│              ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                                                 │  As security exercises progress, LLM quality may                            │             │
│              WORKSPACE                                                                                              │  degrade, especially if progress stalls.                                    │             │
│                CAI>/ws set [NAME] - Set current workspace directory                                                 │                                                                             │             │
│                                                                                                                     │  It's often better to clear the context window                              │             │
│              AGENT MANAGEMENT                                                                                       │  or restart CAI rather than waiting until                                   │             │
│                CAI>/agent [NAME] - Switch to specific agent by name                                                 │  context usage reaches 100%.                                                │             │
│                CAI>/agent 1 2 3 - Switch to agent by position number                                                │                                                                             │             │
│                CAI>/agent - Display list of all available agents                                                    │  When context exceeds 80%, follow these steps:                              │             │
│                                                                                                                     │  1. CAI> Dump your memory and findings in current scenario in findings.txt  │             │
│              MODEL SELECTION                                                                                        │  2. CAI> /flush                                                             │             │
│                CAI>/model [NAME] - Change to a different model by name                                              │  3. CAI> Analyze findings.txt, and continue exercise with target: ...       │             │
│                CAI>/model 1 - Change model by position number                                                       │                                                                             │             │
│                CAI>/model - Show all available models                                                               ╰─────────────────────────────────────────────────────────────────────────────╯             │
│                                                                                                                                                                                                                 │
│              INPUT & EXECUTION                                                                                                                                                                                  │
│                ESC + ENTER - Enter multi-line input mode                                                                                                                                                        │
│                CAI>/shell or CAI> $ - Run system shell commands                                                                                                                                                 │
│                CAI>hi, cybersecurity AI - Any text without commands will be sent as a prompt                                                                                                                    │
│                CAI>/help - Display complete command reference                                                                                                                                                   │
│                CAI>/flush or CAI> /clear - Clear the conversation history                                                                                                                                       │
│                                                                                                                                                                                                                 │
│              UTILITY COMMANDS                                                                                                                                                                                   │
│                CAI>/mcp - Load additional tools with MCP server to an agent                                                                                                                                     │
│                CAI>/virt - Show all available virtualized environments                                                                                                                                          │
│                CAI>/flush - Flush context/message list                                                                                                                                                          │
│              ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━                                                                                                                                             │
│                                                                                                                                                                                                                 │
│                Quick Start Configuration                                                                                                                                                                        │
│                                                                                                                                                                                                                 │
│                1. Configure .env file with your settings                                                                                                                                                        │
│                2. Select an agent: by default: CAI_AGENT_TYPE=one_tool_agent                                                                                                                                    │
│                3. Select a model: by default: CAI_MODEL=qwen2.5:14b                                                                                                                                             │
│                                                                                                                                                                                                                 │
│                Basic Usage:                                                                                                                                                                                     │
│                  1. CAI> /model - View all available models first                                                                                                                                               │
│                  2. CAI> /agent - View all available agents first                                                                                                                                               │
│                  3. CAI> /model deepseek/deepseek-chat - Then select your preferred model                                                                                                                       │
│                  4. CAI> /agent 16 - Then select your preferred agent                                                                                                                                           │
│                  5. CAI> Scan 192.168.1.1 - Example prompt for target scan                                                                                                                                      │
│                                                                                                                                                                                                                 │
│                  /help - Display complete command reference                                                                                                                                                     │
│                                                                                                                                                                                                                 │
│                Common Environment Variables:                                                                                                                                                                    │
│                  CAI_MODEL - Model to use (default: qwen2.5:14b)                                                                                                                                                │
│                  CAI_AGENT_TYPE - Agent type (default: one_tool_agent)                                                                                                                                          │
│                  CAI_DEBUG - Debug level (default: 1)                                                                                                                                                           │
│                  CAI_MAX_TURNS - Max conversation turns (default: inf)                                                                                                                                          │
│                  CAI_TRACING - Enable tracing (default: false)                                                                                                                                                  │
│                                                                                                                                                                                                                 │
│                                                                                                                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Testing challenge: '02 linux ii'
Logging at logs/cai_5fbb981d-bb23-45c0-8b3a-fb1ef4974a72_20250429_072626_root_linux_6.10.14-linuxkit_212_145_137_201.jsonl
╭─ Agent Interaction ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ [0] Agent: CTF agent [05:26:28 (qwen2.5:14b)]                                                                                                                                                                   │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ generic_linux_command(args=--help, command=ls) [Total: 2.7s | Tool: 0.0s] ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                                 │
│  Usage: ls [OPTION]... [FILE]...                                                                                                                                                                                │
│  List information about the FILEs (the current directory by default).                                                                                                                                           │
│  Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.                                                                                                                                      │
│                                                                                                                                                                                                                 │
│  Mandatory arguments to long options are mandatory for short options too.                                                                                                                                       │
│    -a, --all                  do not ignore entries starting with .                                                                                                                                             │
│    -A, --almost-all           do not list implied . and ..                                                                                                                                                      │
│        --author               with -l, print the author of each file                                                                                                                                            │
│    -b, --escape               print C-style escapes for nongraphic characters                                                                                                                                   │
│        --block-size=SIZE      with -l, scale sizes by SIZE when printing them;                                                                                                                                  │
│                                 e.g., '--block-size=M'; see SIZE format below                                                                                                                                   │
│    -B, --ignore-backups       do not list implied entries ending with ~                                                                                                                                         │
│    -c                         with -lt: sort by, and show, ctime (time of last                                                                                                                                  │
│                                 modification of file status information);                                                                                                                                       │
│                                 with -l: show ctime and sort by name;                                                                                                                                           │
│                                 otherwise: sort by ctime, newest first                                                                                                                                          │
│    -C                         list entries by columns                                                                                                                                                           │
│        --color[=WHEN]         colorize the output; WHEN can be 'always' (default                                                                                                                                │
│                                 if omitted), 'auto', or 'never'; more info below                                                                                                                                │
│    -d, --directory            list directories themselves, not their contents                                                                                                                                   │
│    -D, --dired                generate output designed for Emacs' dired mode                                                                                                                                    │
│    -f                         do not sort, enable -aU, disable -ls --color                                                                                                                                      │
│    -F, --classify             append indicator (one of */=>@|) to entries                                                                                                                                       │
│        --file-type            likewise, except do not append '*'                                                                                                                                                │
│        --format=WORD          across -x, commas -m, horizontal -x, long -l,                                                                                                                                     │
│                                 single-column -1, verbose -l, vertical -C                                                                                                                                       │
│        --full-time            like -l --time-style=full-iso                                                                                                                                                     │
│    -g                         like -l, but do not list owner                                                                                                                                                    │
│        --group-directories-first                                                                                                                                                                                │
│                               group directories before files;                                                                                                                                                   │
│                                 can be augmented with a --sort option, but any                                                                                                                                  │
│                                 use of --sort=none (-U) disables grouping                                                                                                                                       │
│    -G, --no-group             in a long listing, don't print group names                                                                                                                                        │
│    -h, --human-readable       with -l and -s, print sizes like 1K 234M 2G etc.                                                                                                                                  │
│        --si                   likewise, but use powers of 1000 not 1024                                                                                                                                         │
│    -H, --dereference-command-line                                                                                                                                                                               │
│                               follow symbolic links listed on the command line                                                                                                                                  │
│        --me: atime or access or use (-u);                                                                                                                                                                       │
│                                 ctime or status (-c); also use specified time                                                                                                                                   │
│                                 as sort key if --sort=time (newest first)                                                                                                                                       │
│        --time-style=TIME_STYLE  time/date format with -l; see TIME_STYLE below                                                                                                                                  │
│    -t                         sort by modification time, newest first                                                                                                                                           │
│    -T, --tabsize=COLS         assume tab stops at each COLS instead of 8                                                                                                                                        │
│    -u                         with -lt: sort by, and show, access time;                                                                                                                                         │
│                                 with -l: show access time and sort by name;                                                                                                                                     │
│                                 otherwise: sort by access time, newest first                                                                                                                                    │
│    -U                         do not sort; list entries in directory order                                                                                                                                      │
│    -v                         natural sort of (version) numbers within text                                                                                                                                     │
│    -w, --width=COLS           set output width to COLS.  0 means no limit                                                                                                                                       │
│    -x                         list entries by lines instead of by columns                                                                                                                                       │
│    -X                         sort alphabetically by entry extension                                                                                                                                            │
│    -Z, --context              print any security context of each file                                                                                                                                           │
│    -1                         list one file per line.  Avoid '\n' with -q or -b                                                                                                                                 │
│        --help     display this help and exit                                                                                                                                                                    │
│        --version  output version information and exit                                                                                                                                                           │
│                                                                                                                                                                                                                 │
│  The SIZE argument is an integer and optional unit (example: 10K is 10*1024).                                                                                                                                   │
│  Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).                                                                                                                                      │
│                                                                                                                                                                                                                 │
│  The TIME_STYLE argument can be full-iso, long-iso, iso, locale, or +FORMAT.                                                                                                                                    │
│  FORMAT is interpreted like in date(1).  If FORMAT is FORMAT1<newline>FORMAT2,                                                                                                                                  │
│  then FORMAT1 applies to non-recent files and FORMAT2 to recent files.                                                                                                                                          │
│  TIME_STYLE prefixed with 'posix-' takes effect only outside the POSIX locale.                                                                                                                                  │
│  Also the TIME_STYLE environment variable sets the default style to use.                                                                                                                                        │
│                                                                                                                                                                                                                 │
│  Using color to distinguish file types is disabled both by default and                                                                                                                                          │
│  with --color=never.  With --color=auto, ls emits color codes only when                                                                                                                                         │
│  standard output is connected to a terminal.  The LS_COLORS environment                                                                                                                                         │
│  variable can change the settings.  Use the dircolors command to set it.                                                                                                                                        │
│                                                                                                                                                                                                                 │
│  Exit status:                                                                                                                                                                                                   │
│   0  if OK,                                                                                                                                                                                                     │
│   1  if minor problems (e.g., cannot access subdirectory),                                                                                                                                                      │
│   2  if serious trouble (e.g., cannot access command-line argument).                                                                                                                                            │
│                                                                                                                                                                                                                 │
│  GNU coreutils online help: <https://www.gnu.org/software/coreutils/>                                                                                                                                           │
│  Report ls translation bugs to <https://translationproject.org/team/>                                                                                                                                           │
│  Full documentation at: <https://www.gnu.org/software/coreutils/ls>                                                                                                                                             │
│  or available locally via: info '(coreutils) ls invocation'
│   Current: I:1085 O:54 R:0 ($0.0000) | Total: I:1085 O:54 R:0 ($0.0000) | Context: 3.4% 🟩                                                                                                                      │
│                                                                                                                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ generic_linux_command(args=-la, command=ls) [Total: 2.8s | Tool: 0.1s] ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                                                                                                                                 │
│  total 44                                                                                                                                                                                                       │
│  drwx------ 1 centos wheel 4096 Apr 29 05:26 .                                                                                                                                                                  │
│  drwxr-xr-x 1 root   root  4096 Sep 19  2024 ..                                                                                                                                                                 │
│  -rw-r--r-- 1 centos wheel   18 Jan 12  2021 .bash_logout                                                                                                                                                       │
│  -rw-r--r-- 1 centos wheel  141 Jan 12  2021 .bash_profile                                                                                                                                                      │
│  -rw-r--r-- 1 centos wheel  376 Jan 12  2021 .bashrc                                                                                                                                                            │
│  drwxr-xr-x 3 centos wheel 4096 Apr 29 05:26 .cache                                                                                                                                                             │
│  drwxr-xr-x 3 root   root  4096 Sep 19  2024 .hidden_flag_dir                                                                                                                                                   │
│  -rw-rw-r-- 1 root   root  3812 Sep 19  2024 flag.dmp                                                                                                                                                           │
│  drwxr-xr-x 2 root   root  4096 Sep 19  2024 flag_dir                                                                                                                                                           │
│  -rw-rw-r-- 1 root   root   178 Sep 19  2024 oddfile.zip                                                                                                                                                        │
│   Current: I:1085 O:54 R:0 ($0.0000) | Total: I:1085 O:54 R:0 ($0.0000) | Context: 3.4% 🟩                                                                                                                      │
│                                                                                                                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
^C
Ctrl+C pressed
CAI> $ pwd
Running in workspace: /
Executing: pwd
/
Command completed successfully
CAI> $ ls
Running in workspace: /
Executing: ls
bin
boot
dev
etc
home
lib
media
mnt
msfinstall
opt
proc
root
run
sbin
srv
sys
tmp
usr
var
vscode
workspace
Command completed successfully

As it can be seen, it's executing commands not "inside" of the CTF, but outside of it. This should not happen, as the env. variable should overwrite the behavior of the command, and not the /workspace define such behavior.

Please rectify this and make sure not that the rest of the functionality remains operational. Consider adding tests so that this doesn't happen again.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions