How to:

* get the tokenizer and preprocessor for a given CLIP model
* get the visual and textual encoders separately
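
The text above does not say which CLIP implementation is meant, so the following is only a minimal sketch under one assumption: that the model is loaded through the Hugging Face `transformers` library. The function names (`load_tokenizer_and_preprocessor`, `load_encoders`, `split_encoders`) and the checkpoint name in the usage comment are illustrative and do not come from the original; other CLIP packages expose different entry points.

```python
def load_tokenizer_and_preprocessor(checkpoint: str):
    """Return (tokenizer, image_preprocessor) for a CLIP checkpoint.

    Assumption: Hugging Face `transformers` is installed and `checkpoint`
    is a model id or local path it can resolve.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoImageProcessor, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    preprocessor = AutoImageProcessor.from_pretrained(checkpoint)
    return tokenizer, preprocessor


def split_encoders(model):
    """Return (visual_encoder, text_encoder) from a loaded CLIP model.

    Assumption: `model` follows the transformers `CLIPModel` layout, which
    keeps the two towers in `vision_model` and `text_model`.
    """
    return model.vision_model, model.text_model


def load_encoders(checkpoint: str):
    """Load a full CLIP model, then hand back its two encoders."""
    from transformers import CLIPModel

    model = CLIPModel.from_pretrained(checkpoint)
    return split_encoders(model)


# Example usage (downloads weights on first call; checkpoint is illustrative):
# tokenizer, preprocessor = load_tokenizer_and_preprocessor("openai/clip-vit-base-patch32")
# visual_encoder, text_encoder = load_encoders("openai/clip-vit-base-patch32")
```

In this layout the two sub-modules return un-projected outputs; the layers that map them into the shared embedding space live on the full model as `visual_projection` and `text_projection`, so keep a reference to the full model if joint embeddings are needed.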