Two-Way Communication: Speech from camera speaker comes out garbled and severely distorted

I have 2 Wyze Cams (firmware on Android) and I need to use two-way communication via the app.
When live streaming, I can hear the person in front of the camera talking fine, but the volume is low.
But when I talk back to them using the app, the audio that comes out of the camera’s speaker is very garbled and almost indecipherable to the people on the other side.

Is this a systemic problem with the Wyze Cam? What's the problem here and how can I make it better?