Skip to content

Latest commit

 

History

History
64 lines (52 loc) · 2.67 KB

channel-bench-resnet.md

File metadata and controls

64 lines (52 loc) · 2.67 KB

Prompt for Channel-Bench-ResNet

System Prompt

You are an expert in the field of neural architecture search.

User Prompt

Your task is to assist me in selecting the best channel numbers for a given model architecture. The model will be trained and tested on CIFAR10, and your objective will be to maximize the model's performance on CIFAR10.

The model architecture will be defined as the following.

{
    layer1: nn.Conv2d(in_channels=3, out_channels=channels[0], kernel_size=3, padding=1, bias=False),
    layer2: BottleneckResidualBlock(in_channels=channels[0], bottleneck_channels=channels[1], out_channels=channels[0], stride=1),
    layer3: BottleneckResidualBlock(in_channels=channels[0], bottleneck_channels=channels[2], out_channels=channels[0], stride=1),
    layer4: BottleneckResidualBlock(in_channels=channels[0], bottleneck_channels=channels[3], out_channels=channels[4], stride=2),
    layer5: BottleneckResidualBlock(in_channels=channels[4], bottleneck_channels=channels[5], out_channels=channels[4], stride=1),
    layer6: BottleneckResidualBlock(in_channels=channels[4], bottleneck_channels=channels[6], out_channels=channels[4], stride=1),
    layer7: nn.AdaptiveAvgPool2d(output_size=1),
    layer8: nn.Linear(in_features=channels[4], out_features=10),
}

The implementation of the BottleneckResidualBlock is as follows:

class BottleneckResidualBlock(nn.Module):
    def __init__(self, in_channels, bottleneck_channels, out_channels, stride):
        super().__init__()

        self.stride = stride

        self.block = nn.Sequential(
            nn.Conv2d(in_channels, bottleneck_channels, 3, stride = stride, padding=1, bias=False),
            nn.BatchNorm2d(bottleneck_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(bottleneck_channels, out_channels, 3, stride = 1, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        if self.stride == 1:
            return self.relu(x + self.block(x))
        else:
            return self.relu(self.block(x))

For the channels variable, the available channel number for each index would be:

{
    channels[0]: [64, 128, 192, 256],
    channels[1]: [64, 128, 192, 256],
    channels[2]: [64, 128, 192, 256],
    channels[3]: [128, 256, 384, 512],
    channels[4]: [128, 256, 384, 512],
    channels[5]: [128, 256, 384, 512],
    channels[6]: [128, 256, 384, 512],
}

Your objective is to define the optimal number of channels for each layer based on the given options above to maximize the model's performance on CIFAR10. Your response should be the channel list consisting of 7 numbers (e.g. [64, 192, ..., 256]).