Hi, I was reading this paragraph:
- This includes the ability for the existing FullyShardedDataParallel to expose the original parameters (not FlatParameters) via use_orig_params=True, which enables flexible support for multiple parameter groups.
I had a hard time understanding what use_orig_params means. Does it mean that we can include frozen params alongside trainable params? If that’s the case, do you think a better flag name (perhaps for a future major version) would be allow_frozen_params? Otherwise, I might be misunderstanding (please let me know if that’s the case!)
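
To make the question concrete, here is a minimal sketch of what I understand by "multiple parameter groups" and by mixing frozen and trainable params. It uses a plain nn.Module that is not wrapped in FullyShardedDataParallel (wrapping needs an initialized process group), and the toy model and hyperparameter values are made up for illustration. What I am unsure about is whether use_orig_params=True is what makes this pattern work once the module is wrapped.

```python
import torch
from torch import nn

# Hypothetical toy model, NOT wrapped in FSDP (that would need a process group).
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Freeze the first layer so frozen and trainable parameters coexist.
for p in model[0].parameters():
    p.requires_grad = False

# Split the remaining trainable parameters by name into two groups.
decay = [
    p for n, p in model.named_parameters()
    if p.requires_grad and n.endswith("weight")
]
no_decay = [
    p for n, p in model.named_parameters()
    if p.requires_grad and n.endswith("bias")
]

# Each dict is one parameter group with its own hyperparameters
# (the values here are arbitrary).
optimizer = torch.optim.AdamW(
    [
        {"params": decay, "weight_decay": 0.01},
        {"params": no_decay, "weight_decay": 0.0},
    ],
    lr=1e-3,
)
```

The split depends on being able to address each original parameter by name, which is why I suspect the flag is about more than frozen params.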