
google cloud platform - Make gemini-1.5-flash-002 accessible for my GCloud Run project - Stack Overflow


I am trying a basic script to summarize text:

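import vertexai
from vertexai.generative_models import GenerativeModel

def generate(self, text_to_summarize):
    # Placeholders: replace with the real project ID and region.
    vertexai.init(project="<PROJECT_ID>", location="MY_REGION")
    model = GenerativeModel(
        "gemini-1.5-flash-002",
        system_instruction=[my_prompt],  # my_prompt is defined elsewhere in the script
    )
    # Stream the summary back chunk by chunk.
    responses = model.generate_content(
        [text_to_summarize],
        stream=True,
    )

    for response in responses:
        print(response.text, end="")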

This works as intended locally, using "gemini-1.5-flash-002".

To run it on Cloud Run, I built the script into a Docker container and deployed it to Cloud Run.
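Inside the container the code no longer runs under my user account, so it may be worth logging which identity it does run as. A minimal sketch, assuming the standard Compute metadata server is reachable from the Cloud Run container; the helper names are hypothetical and not part of my script:

```python
import urllib.request
from urllib.error import URLError

# Standard metadata-server path for the email of the runtime service account.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def build_identity_request() -> urllib.request.Request:
    """Build the metadata request; the Metadata-Flavor header is required."""
    return urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})


def runtime_service_account(timeout: float = 2.0):
    """Return the service account email, or None if no metadata server is reachable."""
    try:
        with urllib.request.urlopen(build_identity_request(), timeout=timeout) as resp:
            return resp.read().decode("utf-8").strip()
    except (URLError, OSError):
        # Expected outside Google Cloud, e.g. on a local machine.
        return None
```

Calling runtime_service_account() at startup and printing the result would show which service account the deployed container uses; locally it should simply return None.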

Calling the endpoint then fails with error:

"PermissionDenied(\"Permission 'aiplatform.endpoints.predict' denied on resource '//aiplatform.googleapis.com/projects/<PROJECT-ID>/locations/<REGION>/publishers/google/models/gemini-1.5-flash-002' (or it may not exist).\")"

I have double-checked permissions with gcloud projects get-iam-policy <PROJECT-ID> and see:

bindings:
- members:
  - serviceAccount:service-<CODE>@gcp-sa-vertex-op.iam.gserviceaccount.com
  role: roles/aiplatform.onlinePredictionServiceAgent
- members:
  - serviceAccount:service-<CODE>@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:<CODE>[email protected]
  - user:<MY-EMAIL>
  role: roles/aiplatform.user
...

I checked the roles documentation, and aiplatform.endpoints.predict is a permission included in roles/aiplatform.user, so I should have permission.
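To make this check less error-prone than reading the YAML by eye, the policy can be exported with --format=json and searched for a given member. A minimal sketch; the function name and the sample policy (including the example-sa account and the email address) are hypothetical placeholders, not my real bindings:

```python
import json


def roles_for_member(policy: dict, member: str) -> list:
    """Return the sorted roles whose bindings list `member` exactly."""
    return sorted(
        binding["role"]
        for binding in policy.get("bindings", [])
        if member in binding.get("members", [])
    )


# Hypothetical sample in the shape printed by
# `gcloud projects get-iam-policy <PROJECT-ID> --format=json`.
sample_policy = json.loads("""
{
  "bindings": [
    {"members": ["serviceAccount:example-sa@example-project.iam.gserviceaccount.com",
                 "user:someone@example.com"],
     "role": "roles/aiplatform.user"},
    {"members": ["user:someone@example.com"],
     "role": "roles/viewer"}
  ]
}
""")

print(roles_for_member(sample_policy, "user:someone@example.com"))
```

The match is an exact string comparison, so the member has to be passed with its prefix (user: or serviceAccount:), exactly as it appears in the policy.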

This led me to conclude that the model does not exist. I thought Cloud Run would automatically use the Gemini Flash model, as it does locally. I ran

gcloud ai models list --region=<REGION>

and there are no models.

Even trying to deploy that model to my endpoint fails. The command I used is:

gcloud ai endpoints deploy-model <MY-ENDPOINT-ID> \
   --model=gemini-1.5-flash-002 \
   --region=<REGION> \
   --display-name="flash-deployment" \
   --machine-type="n1-standard-4"

and this fails with

(gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects/<PROJECT-ID>/locations/<REGION>/models/gemini-1.5-flash-002' exists.

I think I have to register the model somewhere, but when I open the Model Registry and try to "Create" one, it asks me for training data and so on. I do not want to train a new model; I just want to use the pretrained Flash model.

Does anyone know how this can be achieved?
